The data is related with direct marketing campaigns (phone calls) of a Portuguese banking institution. The classification goal is to predict if the client will subscribe a term deposit (variable y).

Data available at UCI machine learning repository here

Citation Request: [Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014

Input variables: The data is related with direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact to the same client was required, in order to access if the product (bank term deposit) would be (or not) subscribed.

The classification goal is to predict if the client will subscribe a term deposit (variable y).

  1. Number of Instances: 45211 for bank-full.csv (4521 for bank.csv)

  2. Number of Attributes: 16 + output attribute.

  3. Attribute information:

For more information, read [Moro et al., 2011].

Input variables: # bank client data: 1 - age (numeric) 2 - job : type of job (categorical: “admin.”,“unknown”,“unemployed”,“management”,“housemaid”,“entrepreneur”,“student”, “blue-collar”,“self-employed”,“retired”,“technician”,“services”) 3 - marital : marital status (categorical: “married”,“divorced”,“single”; note: “divorced” means divorced or widowed) 4 - education (categorical: “unknown”,“secondary”,“primary”,“tertiary”) * 5 - default: has credit in default? (binary: “yes”,“no”) 6 - balance: average yearly balance, in euros (numeric) 7 - housing: has housing loan? (binary: “yes”,“no”) 8 - loan: has personal loan? (binary: “yes”,“no”) # related with the last contact of the current campaign: 9 - contact: contact communication type (categorical: “unknown”,“telephone”,“cellular”) 10 - day: last contact day of the month (numeric) 11 - month: last contact month of year (categorical: “jan”, “feb”, “mar”, …, “nov”, “dec”) 12 - duration: last contact duration, in seconds (numeric) # other attributes: 13 - campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact) 14 - pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric, -1 means client was not previously contacted) 15 - previous: number of contacts performed before this campaign and for this client (numeric) *16 - poutcome: outcome of the previous marketing campaign (categorical: “unknown”,“other”,“failure”,“success”)

Output variable (desired target): *17 - y - has the client subscribed a term deposit? (binary: “yes”,“no”)

*8. Missing Attribute Values: None

pacman::p_load("lubridate", "dplyr", "magrittr")
library(rio)
library(doParallel)
library(viridis)
library(RColorBrewer)
library(tidyverse)
library(ggthemes)
library(knitr)
library(tidyverse)
library(caret)
library(caretEnsemble)
library(plotly)
library(lime)
library(plotROC)
# Calculate the number of cores
no_cores <- detectCores() - 1
cl<-makeCluster(no_cores)
registerDoParallel(cl)
setwd("/Users/nanaakwasiabayieboateng/Documents/memphisclassesbooks/DataMiningscience")
#load excel file with rio
Data<- rio::import("/Users/nanaakwasiabayieboateng/Documents/memphisclassesbooks/DataMiningscience/Anomalydetection/bank/bank-full.csv")
Warning: closing unused connection 9 (<-localhost:11278)
Warning: closing unused connection 8 (<-localhost:11278)
Warning: closing unused connection 7 (<-localhost:11278)
Warning: closing unused connection 6 (<-localhost:11278)
Warning: closing unused connection 5 (<-localhost:11278)
Warning: closing unused connection 4 (<-localhost:11278)
Warning: closing unused connection 3 (<-localhost:11278)
Data%>%head
#%>%kable()
summary(Data)
      age            job              marital           education           default         
 Min.   :18.00   Length:45211       Length:45211       Length:45211       Length:45211      
 1st Qu.:33.00   Class :character   Class :character   Class :character   Class :character  
 Median :39.00   Mode  :character   Mode  :character   Mode  :character   Mode  :character  
 Mean   :40.94                                                                              
 3rd Qu.:48.00                                                                              
 Max.   :95.00                                                                              
    balance         housing              loan             contact               day       
 Min.   : -8019   Length:45211       Length:45211       Length:45211       Min.   : 1.00  
 1st Qu.:    72   Class :character   Class :character   Class :character   1st Qu.: 8.00  
 Median :   448   Mode  :character   Mode  :character   Mode  :character   Median :16.00  
 Mean   :  1362                                                            Mean   :15.81  
 3rd Qu.:  1428                                                            3rd Qu.:21.00  
 Max.   :102127                                                            Max.   :31.00  
    month              duration         campaign          pdays          previous       
 Length:45211       Min.   :   0.0   Min.   : 1.000   Min.   : -1.0   Min.   :  0.0000  
 Class :character   1st Qu.: 103.0   1st Qu.: 1.000   1st Qu.: -1.0   1st Qu.:  0.0000  
 Mode  :character   Median : 180.0   Median : 2.000   Median : -1.0   Median :  0.0000  
                    Mean   : 258.2   Mean   : 2.764   Mean   : 40.2   Mean   :  0.5803  
                    3rd Qu.: 319.0   3rd Qu.: 3.000   3rd Qu.: -1.0   3rd Qu.:  0.0000  
                    Max.   :4918.0   Max.   :63.000   Max.   :871.0   Max.   :275.0000  
   poutcome              y            
 Length:45211       Length:45211      
 Class :character   Class :character  
 Mode  :character   Mode  :character  
                                      
                                      
                                      

There are no missing observations in the data.

#==================================================================
#check the number of missing rows
#==================================================================
colSums(is.na.data.frame(Data))
      age       job   marital education   default   balance   housing      loan   contact       day 
        0         0         0         0         0         0         0         0         0         0 
    month  duration  campaign     pdays  previous  poutcome         y 
        0         0         0         0         0         0         0 
#==================================================================
# descriptive/summary statistics
#==================================================================
Hmisc::describe.data.frame(Data)   
Data 

 17  Variables      45211  Observations
--------------------------------------------------------------------------------------------------------
age 
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75 
   45211        0       77    0.999    40.94    11.87       27       29       33       39       48 
     .90      .95 
      56       59 

lowest : 18 19 20 21 22, highest: 90 92 93 94 95
--------------------------------------------------------------------------------------------------------
job 
       n  missing distinct 
   45211        0       12 
                                                                                              
Value             admin.   blue-collar  entrepreneur     housemaid    management       retired
Frequency           5171          9732          1487          1240          9458          2264
Proportion         0.114         0.215         0.033         0.027         0.209         0.050
                                                                                              
Value      self-employed      services       student    technician    unemployed       unknown
Frequency           1579          4154           938          7597          1303           288
Proportion         0.035         0.092         0.021         0.168         0.029         0.006
--------------------------------------------------------------------------------------------------------
marital 
       n  missing distinct 
   45211        0        3 
                                     
Value      divorced  married   single
Frequency      5207    27214    12790
Proportion    0.115    0.602    0.283
--------------------------------------------------------------------------------------------------------
education 
       n  missing distinct 
   45211        0        4 
                                                  
Value        primary secondary  tertiary   unknown
Frequency       6851     23202     13301      1857
Proportion     0.152     0.513     0.294     0.041
--------------------------------------------------------------------------------------------------------
default 
       n  missing distinct 
   45211        0        2 
                      
Value         no   yes
Frequency  44396   815
Proportion 0.982 0.018
--------------------------------------------------------------------------------------------------------
balance 
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75 
   45211        0     7168        1     1362     2054     -172        0       72      448     1428 
     .90      .95 
    3574     5768 

lowest :  -8019  -6847  -4057  -3372  -3313, highest:  66721  71188  81204  98417 102127
--------------------------------------------------------------------------------------------------------
housing 
       n  missing distinct 
   45211        0        2 
                      
Value         no   yes
Frequency  20081 25130
Proportion 0.444 0.556
--------------------------------------------------------------------------------------------------------
loan 
       n  missing distinct 
   45211        0        2 
                      
Value         no   yes
Frequency  37967  7244
Proportion  0.84  0.16
--------------------------------------------------------------------------------------------------------
contact 
       n  missing distinct 
   45211        0        3 
                                        
Value       cellular telephone   unknown
Frequency      29285      2906     13020
Proportion     0.648     0.064     0.288
--------------------------------------------------------------------------------------------------------
day 
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75 
   45211        0       31    0.999    15.81    9.576        3        5        8       16       21 
     .90      .95 
      28       29 

lowest :  1  2  3  4  5, highest: 27 28 29 30 31
--------------------------------------------------------------------------------------------------------
month 
       n  missing distinct 
   45211        0       12 
                                                                                  
Value        apr   aug   dec   feb   jan   jul   jun   mar   may   nov   oct   sep
Frequency   2932  6247   214  2649  1403  6895  5341   477 13766  3970   738   579
Proportion 0.065 0.138 0.005 0.059 0.031 0.153 0.118 0.011 0.304 0.088 0.016 0.013
--------------------------------------------------------------------------------------------------------
duration 
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75 
   45211        0     1573        1    258.2    235.4       35       58      103      180      319 
     .90      .95 
     548      751 

lowest :    0    1    2    3    4, highest: 3366 3422 3785 3881 4918
--------------------------------------------------------------------------------------------------------
campaign 
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75 
   45211        0       48    0.918    2.764    2.383        1        1        1        2        3 
     .90      .95 
       5        8 

lowest :  1  2  3  4  5, highest: 50 51 55 58 63
--------------------------------------------------------------------------------------------------------
pdays 
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75 
   45211        0      559    0.454     40.2    71.61       -1       -1       -1       -1       -1 
     .90      .95 
     185      317 

lowest :  -1   1   2   3   4, highest: 838 842 850 854 871
--------------------------------------------------------------------------------------------------------
previous 
       n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75 
   45211        0       41    0.454   0.5803    1.044        0        0        0        0        0 
     .90      .95 
       2        3 

lowest :   0   1   2   3   4, highest:  41  51  55  58 275
--------------------------------------------------------------------------------------------------------
poutcome 
       n  missing distinct 
   45211        0        4 
                                          
Value      failure   other success unknown
Frequency     4901    1840    1511   36959
Proportion   0.108   0.041   0.033   0.817
--------------------------------------------------------------------------------------------------------
y 
       n  missing distinct 
   45211        0        2 
                      
Value         no   yes
Frequency  39922  5289
Proportion 0.883 0.117
--------------------------------------------------------------------------------------------------------
#describe(Data)
#==================================================================
# Histograms
#==================================================================
theme_set(theme_economist_white())
#ggplot(Data) + geom_boxplot(aes(x =age,y=duration,color=y))
ggplot(Data, aes(x ="",y=age, fill=y))+ geom_boxplot()+labs(x="age",y="")

#ggplotly(p)
ggplot(Data, aes(x =duration, fill=y))+ geom_histogram(bins = 30)

#ggplotly()
ggplot(Data, aes(x =age, fill=y))+ geom_histogram(bins = 30)

# ggplotly()
 
ggplot(Data, aes(x =day, fill=y))+ geom_histogram(bins = 30)

#ggplotly()
ggplot(Data, aes(x =balance, fill=y))+ geom_histogram(bins = 30)

#ggplotly()
ggplot(Data, aes(x =age, fill=y))+ geom_histogram(bins = 30)

# ggplotly() 
 # geom_density(alpha=1/3,color="red") + scale_fill_hue()
ggplot(Data, aes(x=age, fill=y)) + geom_density(alpha=1/3) + scale_fill_hue()

ggplotly()

NA

Convert character variables to factor variables.This is neccessary for the caret package to train the models we are interested in later.

Data<-Data %>% mutate_if(is.character, as.factor) 
str(Data)
'data.frame':   45211 obs. of  17 variables:
 $ age      : int  58 44 33 47 33 35 28 42 58 43 ...
 $ job      : Factor w/ 12 levels "admin.","blue-collar",..: 5 10 3 2 12 5 5 3 6 10 ...
 $ marital  : Factor w/ 3 levels "divorced","married",..: 2 3 2 2 3 2 3 1 2 3 ...
 $ education: Factor w/ 4 levels "primary","secondary",..: 3 2 2 4 4 3 3 3 1 2 ...
 $ default  : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 2 1 1 ...
 $ balance  : int  2143 29 2 1506 1 231 447 2 121 593 ...
 $ housing  : Factor w/ 2 levels "no","yes": 2 2 2 2 1 2 2 2 2 2 ...
 $ loan     : Factor w/ 2 levels "no","yes": 1 1 2 1 1 1 2 1 1 1 ...
 $ contact  : Factor w/ 3 levels "cellular","telephone",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ day      : int  5 5 5 5 5 5 5 5 5 5 ...
 $ month    : Factor w/ 12 levels "apr","aug","dec",..: 9 9 9 9 9 9 9 9 9 9 ...
 $ duration : int  261 151 76 92 198 139 217 380 50 55 ...
 $ campaign : int  1 1 1 1 1 1 1 1 1 1 ...
 $ pdays    : int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
 $ previous : int  0 0 0 0 0 0 0 0 0 0 ...
 $ poutcome : Factor w/ 4 levels "failure","other",..: 4 4 4 4 4 4 4 4 4 4 ...
 $ y        : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...
ggplot(Data, aes(x =y))+ geom_histogram(bins = 4,stat="count")+labs(x="Term Deposit")
Ignoring unknown parameters: binwidth, bins, pad

#==================================================================
#Converting outcome variable to numeric
#==================================================================
Data$y<-ifelse(Data$y=='no',0,1)
str(Data)
'data.frame':   45211 obs. of  17 variables:
 $ age      : int  58 44 33 47 33 35 28 42 58 43 ...
 $ job      : chr  "management" "technician" "entrepreneur" "blue-collar" ...
 $ marital  : chr  "married" "single" "married" "married" ...
 $ education: chr  "tertiary" "secondary" "secondary" "unknown" ...
 $ default  : chr  "no" "no" "no" "no" ...
 $ balance  : int  2143 29 2 1506 1 231 447 2 121 593 ...
 $ housing  : chr  "yes" "yes" "yes" "yes" ...
 $ loan     : chr  "no" "no" "yes" "no" ...
 $ contact  : chr  "unknown" "unknown" "unknown" "unknown" ...
 $ day      : int  5 5 5 5 5 5 5 5 5 5 ...
 $ month    : chr  "may" "may" "may" "may" ...
 $ duration : int  261 151 76 92 198 139 217 380 50 55 ...
 $ campaign : int  1 1 1 1 1 1 1 1 1 1 ...
 $ pdays    : int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
 $ previous : int  0 0 0 0 0 0 0 0 0 0 ...
 $ poutcome : chr  "unknown" "unknown" "unknown" "unknown" ...
 $ y        : num  0 0 0 0 0 0 0 0 0 0 ...
glimpse(Data)
Observations: 45,211
Variables: 17
$ age       <int> 58, 44, 33, 47, 33, 35, 28, 42, 58, 43, 41, 29, 53, 58, 57, 51, 45, 57, 60, 33, 2...
$ job       <chr> "management", "technician", "entrepreneur", "blue-collar", "unknown", "management...
$ marital   <chr> "married", "single", "married", "married", "single", "married", "single", "divorc...
$ education <chr> "tertiary", "secondary", "secondary", "unknown", "unknown", "tertiary", "tertiary...
$ default   <chr> "no", "no", "no", "no", "no", "no", "no", "yes", "no", "no", "no", "no", "no", "n...
$ balance   <int> 2143, 29, 2, 1506, 1, 231, 447, 2, 121, 593, 270, 390, 6, 71, 162, 229, 13, 52, 6...
$ housing   <chr> "yes", "yes", "yes", "yes", "no", "yes", "yes", "yes", "yes", "yes", "yes", "yes"...
$ loan      <chr> "no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no", "no", "no", "no", "...
$ contact   <chr> "unknown", "unknown", "unknown", "unknown", "unknown", "unknown", "unknown", "unk...
$ day       <int> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, ...
$ month     <chr> "may", "may", "may", "may", "may", "may", "may", "may", "may", "may", "may", "may...
$ duration  <int> 261, 151, 76, 92, 198, 139, 217, 380, 50, 55, 222, 137, 517, 71, 174, 353, 98, 38...
$ campaign  <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ pdays     <int> -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -...
$ previous  <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
$ poutcome  <chr> "unknown", "unknown", "unknown", "unknown", "unknown", "unknown", "unknown", "unk...
$ y         <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...

Convert Categorical varialbes to dummy variables using either Model.matrix or sparse.model,matrix

predictors<-setdiff(names(Data),Data$y)
#names(Data)
predictors<-names(Data[,-17])
#paste0(predictors,sep="",collapse = "+ ")
#paste("~",paste0(predictors,sep="",collapse = "+ "))
as.formula(paste("~",paste0(predictors,sep="",collapse = "+ ")))
~age + job + marital + education + default + balance + housing + 
    loan + contact + day + month + duration + campaign + pdays + 
    previous + poutcome
d1<-model.matrix(as.formula(paste("~",paste0(predictors,sep="",collapse = "+ "))), Data)
d1b<-Matrix::sparse.model.matrix(as.formula(paste("~",paste0(predictors,sep="",collapse = "+ "))), Data)
#d1<-model.matrix(~age+job+marital+education + default + balance + housing + 
#    loan + contact + day + month + duration + campaign + pdays + 
#    previous + poutcome,Data)
#d1b<-Matrix::sparse.model.matrix(~age+job+marital+education + default + balance + housing + 
#    loan + contact + day + month + duration + campaign + pdays + 
#    previous + poutcome,Data)
head(d1)
  (Intercept) age jobblue-collar jobentrepreneur jobhousemaid jobmanagement jobretired jobself-employed
1           1  58              0               0            0             1          0                0
2           1  44              0               0            0             0          0                0
3           1  33              0               1            0             0          0                0
4           1  47              1               0            0             0          0                0
5           1  33              0               0            0             0          0                0
6           1  35              0               0            0             1          0                0
  jobservices jobstudent jobtechnician jobunemployed jobunknown maritalmarried maritalsingle
1           0          0             0             0          0              1             0
2           0          0             1             0          0              0             1
3           0          0             0             0          0              1             0
4           0          0             0             0          0              1             0
5           0          0             0             0          1              0             1
6           0          0             0             0          0              1             0
  educationsecondary educationtertiary educationunknown defaultyes balance housingyes loanyes
1                  0                 1                0          0    2143          1       0
2                  1                 0                0          0      29          1       0
3                  1                 0                0          0       2          1       1
4                  0                 0                1          0    1506          1       0
5                  0                 0                1          0       1          0       0
6                  0                 1                0          0     231          1       0
  contacttelephone contactunknown day monthaug monthdec monthfeb monthjan monthjul monthjun monthmar
1                0              1   5        0        0        0        0        0        0        0
2                0              1   5        0        0        0        0        0        0        0
3                0              1   5        0        0        0        0        0        0        0
4                0              1   5        0        0        0        0        0        0        0
5                0              1   5        0        0        0        0        0        0        0
6                0              1   5        0        0        0        0        0        0        0
  monthmay monthnov monthoct monthsep duration campaign pdays previous poutcomeother poutcomesuccess
1        1        0        0        0      261        1    -1        0             0               0
2        1        0        0        0      151        1    -1        0             0               0
3        1        0        0        0       76        1    -1        0             0               0
4        1        0        0        0       92        1    -1        0             0               0
5        1        0        0        0      198        1    -1        0             0               0
6        1        0        0        0      139        1    -1        0             0               0
  poutcomeunknown
1               1
2               1
3               1
4               1
5               1
6               1
head(d1b)
[1] 1 1 1 1 1 1

The dummy conversion results in 42 variables.

#==================================================================
#convert categorical variables  to numeric variables
#==================================================================
dmy <- dummyVars(" ~ .", data = Data,fullRank = T)
transformed <- data.frame(predict(dmy, newdata =Data))
#Checking the structure of transformed train file
str(transformed)
'data.frame':   45211 obs. of  43 variables:
 $ age               : num  58 44 33 47 33 35 28 42 58 43 ...
 $ jobblue.collar    : num  0 0 0 1 0 0 0 0 0 0 ...
 $ jobentrepreneur   : num  0 0 1 0 0 0 0 1 0 0 ...
 $ jobhousemaid      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ jobmanagement     : num  1 0 0 0 0 1 1 0 0 0 ...
 $ jobretired        : num  0 0 0 0 0 0 0 0 1 0 ...
 $ jobself.employed  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ jobservices       : num  0 0 0 0 0 0 0 0 0 0 ...
 $ jobstudent        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ jobtechnician     : num  0 1 0 0 0 0 0 0 0 1 ...
 $ jobunemployed     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ jobunknown        : num  0 0 0 0 1 0 0 0 0 0 ...
 $ maritalmarried    : num  1 0 1 1 0 1 0 0 1 0 ...
 $ maritalsingle     : num  0 1 0 0 1 0 1 0 0 1 ...
 $ educationsecondary: num  0 1 1 0 0 0 0 0 0 1 ...
 $ educationtertiary : num  1 0 0 0 0 1 1 1 0 0 ...
 $ educationunknown  : num  0 0 0 1 1 0 0 0 0 0 ...
 $ defaultyes        : num  0 0 0 0 0 0 0 1 0 0 ...
 $ balance           : num  2143 29 2 1506 1 ...
 $ housingyes        : num  1 1 1 1 0 1 1 1 1 1 ...
 $ loanyes           : num  0 0 1 0 0 0 1 0 0 0 ...
 $ contacttelephone  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ contactunknown    : num  1 1 1 1 1 1 1 1 1 1 ...
 $ day               : num  5 5 5 5 5 5 5 5 5 5 ...
 $ monthaug          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthdec          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthfeb          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthjan          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthjul          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthjun          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthmar          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthmay          : num  1 1 1 1 1 1 1 1 1 1 ...
 $ monthnov          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthoct          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ monthsep          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ duration          : num  261 151 76 92 198 139 217 380 50 55 ...
 $ campaign          : num  1 1 1 1 1 1 1 1 1 1 ...
 $ pdays             : num  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
 $ previous          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poutcomeother     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poutcomesuccess   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poutcomeunknown   : num  1 1 1 1 1 1 1 1 1 1 ...
 $ y                 : num  0 0 0 0 0 0 0 0 0 0 ...
#==================================================================
#Converting the dependent variable back to categorical
#==================================================================
transformed$y<-as.factor(transformed$y)
#==================================================================
#Spliting training set into two parts based on outcome: 70% and 30%
#==================================================================
index<-createDataPartition(transformed$y,p=0.70, list=FALSE)
trainSet<-transformed[index,]
testSet<-transformed[-index,]
outcomeName<-'y'
predictors<-names(trainSet)[!names(trainSet) %in% outcomeName]
predictors
 [1] "age"                "jobblue.collar"     "jobentrepreneur"    "jobhousemaid"      
 [5] "jobmanagement"      "jobretired"         "jobself.employed"   "jobservices"       
 [9] "jobstudent"         "jobtechnician"      "jobunemployed"      "jobunknown"        
[13] "maritalmarried"     "maritalsingle"      "educationsecondary" "educationtertiary" 
[17] "educationunknown"   "defaultyes"         "balance"            "housingyes"        
[21] "loanyes"            "contacttelephone"   "contactunknown"     "day"               
[25] "monthaug"           "monthdec"           "monthfeb"           "monthjan"          
[29] "monthjul"           "monthjun"           "monthmar"           "monthmay"          
[33] "monthnov"           "monthoct"           "monthsep"           "duration"          
[37] "campaign"           "pdays"              "previous"           "poutcomeother"     
[41] "poutcomesuccess"    "poutcomeunknown"   
#==================================================================
#Feature selection using rfe in caret(recursive feature extraction)
#predictors<-names(trainSet)[!names(trainSet) %in% outcomeName]
#Alternatively
#predictors<-setdiff(names(trainSet),outcomeName)
#==================================================================
library(randomForest)
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.

Attaching package: ‘randomForest’

The following object is masked from ‘package:Hmisc’:

    combine

The following object is masked from ‘package:ggplot2’:

    margin

The following object is masked from ‘package:dplyr’:

    combine

Warning messages:
1: closing unused connection 9 (<-localhost:11966) 
2: closing unused connection 8 (<-localhost:11966) 
3: closing unused connection 7 (<-localhost:11966) 
4: closing unused connection 6 (<-localhost:11966) 
5: closing unused connection 5 (<-localhost:11966) 
6: closing unused connection 4 (<-localhost:11966) 
7: closing unused connection 3 (<-localhost:11966) 
control <- rfeControl(functions = rfFuncs,
                      method = "repeatedcv",
                      repeats = 3,
                      verbose = FALSE)
outcomeName<-'y'
predictors<-names(trainSet)[!names(trainSet) %in% outcomeName]
feature_select <- rfe(trainSet[,predictors], trainSet[,outcomeName],
                         rfeControl = control)
#save(feature_select,file="feature_select.RData")
 
load("feature_select.RData")
#The top 5 variables (out of 42):
#print("The top 5 variables (out of 42)\n")
cat("The top 5 variables (out of 42)\n")
The top 5 variables (out of 42)
cat("duration, poutcomesuccess, monthmar, contactunknown, housingyes\n")   
duration, poutcomesuccess, monthmar, contactunknown, housingyes
predictors(feature_select)
 [1] "duration"           "poutcomesuccess"    "monthmar"           "contactunknown"    
 [5] "housingyes"         "monthoct"           "age"                "day"               
 [9] "monthsep"           "pdays"              "monthjul"           "monthdec"          
[13] "monthaug"           "monthmay"           "monthfeb"           "monthjun"          
[17] "previous"           "monthnov"           "poutcomeunknown"    "monthjan"          
[21] "campaign"           "loanyes"            "maritalmarried"     "educationtertiary" 
[25] "maritalsingle"      "balance"            "jobblue.collar"     "contacttelephone"  
[29] "jobmanagement"      "jobstudent"         "jobservices"        "poutcomeother"     
[33] "defaultyes"         "jobretired"         "educationsecondary" "educationunknown"  
[37] "jobtechnician"      "jobhousemaid"       "jobunknown"         "jobentrepreneur"   
[41] "jobself.employed"   "jobunemployed"     
#===================================================================================
# plot variable selection
#===================================================================================
trellis.par.set(caretTheme())
Note: The default device has been opened to honour attempt to modify trellis settings
plot(feature_select, type = c( "o","g"))

About 8(14 variables ) features provides the optimal accuracy for training.

The top 5 variables provides an accuracy of about 90% for the data. The remaining 36 variables add less than 0.1 . This is the advantage of feature engineering. It helps to reduce complexity in the model, reduce overfitting and also computationaly time.

#===================================================================================
#Taking only the top 5 predictors
#Age, Employment.Primarily.retired..or, Education.10th.Grade, 
#Employment.Unable.to.work., race_black.Yes from feature_select
# Cs function from Hmisc converts names to character variables
#===================================================================================
library(Hmisc)
p=c(paste0(predictors(feature_select),sep=",",collapse = ""))
#trainSet[,]
predictor=Hmisc::Cs(duration,poutcome.success,month.mar,contact.unknown,age,housing.yes,day,month.oct,pdays,month.sep,month.jul,month.dec,month.may,month.aug,campaign,month.jun )
print("The set of features selected is:\n") 
[1] "The set of features selected is:\n"
p
[1] "duration,poutcomesuccess,monthmar,contactunknown,housingyes,monthoct,age,day,monthsep,pdays,monthjul,monthdec,monthaug,monthmay,monthfeb,monthjun,previous,monthnov,poutcomeunknown,monthjan,campaign,loanyes,maritalmarried,educationtertiary,maritalsingle,balance,jobblue.collar,contacttelephone,jobmanagement,jobstudent,jobservices,poutcomeother,defaultyes,jobretired,educationsecondary,educationunknown,jobtechnician,jobhousemaid,jobunknown,jobentrepreneur,jobself.employed,jobunemployed,"
trainSet[,outcomeName]%>%head()
[1] 0 0 0 0 0 0
Levels: 0 1
class(trainSet[,outcomeName])
[1] "factor"
Topfivepred = Hmisc::Cs(duration, poutcomesuccess, monthmar, contactunknown, housingyes) 
trainSet[,Topfivepred]%>%head()

Logistic Regression Model

model_glm<-train(trainSet[,Topfivepred],as.factor(trainSet$y),method='glm',family="binomial")
summary(model_glm)

Call:
NULL

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-5.6320  -0.4069  -0.2862  -0.1763   3.1529  

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
(Intercept)     -2.989e+00  3.847e-02  -77.69   <2e-16 ***
duration         3.994e-03  7.338e-05   54.42   <2e-16 ***
poutcomesuccess  2.617e+00  7.374e-02   35.49   <2e-16 ***
monthmar         2.175e+00  1.258e-01   17.29   <2e-16 ***
contactunknown  -1.301e+00  6.729e-02  -19.34   <2e-16 ***
housingyes      -7.931e-01  4.373e-02  -18.14   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 22845  on 31648  degrees of freedom
Residual deviance: 16233  on 31643  degrees of freedom
AIC: 16245

Number of Fisher Scoring iterations: 6

The accuracy of the logistic regression model about 90.4 %

#save(model_glm,file="model_glm.RData")
 
load("model_glm.RData")
 
 
# Predict using the test data
pred<-predict(model_glm,testSet)
my_data=data.frame(cbind(predicted=pred,observed=testSet$y))
ggplot(my_data,aes(predicted,observed))+geom_point()+geom_smooth(method=lm)+ggtitle('logistic model')

# Print, plot variable importance
print(varImp(model_glm, scale = FALSE))
glm variable importance

                Overall
duration          54.92
poutcomesuccess   36.58
contactunknown    18.68
housingyes        17.54
monthmar          16.91
plot(varImp(model_glm, scale = FALSE), main="Variable Importance using logistic/glm")

confusionMatrix(testSet$y,pred)
Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 11704   272
         1  1063   523
                                          
               Accuracy : 0.9016          
                 95% CI : (0.8964, 0.9065)
    No Information Rate : 0.9414          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0.3918          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.9167          
            Specificity : 0.6579          
         Pos Pred Value : 0.9773          
         Neg Pred Value : 0.3298          
             Prevalence : 0.9414          
         Detection Rate : 0.8630          
   Detection Prevalence : 0.8831          
      Balanced Accuracy : 0.7873          
                                          
       'Positive' Class : 0               
                                          
#==================================================================
#ROCR Curve
#==================================================================
library(pROC)
#install.packages("pROC")
# Compute AUC for predicting Class with the variable CreditHistory.Critical
f1 = roc(Data$y ~ Data$duration, data=trainSet) 
plot(f1, col="red")
p=ggplot(trainSet, aes(d = y, m = duration)) + geom_roc()+ style_roc()
plot_interactive_roc((p))

#Draw the ROC curve 
glm.probs <- predict(model_glm,testSet,type="prob")
head(glm.probs)
 
glm.ROC <- roc(predictor=glm.probs$PS,
               response=testSet$y,
               levels=rev(levels(testSet$y)))
Error in roc.default(predictor = glm.probs$PS, response = testSet$y, levels = rev(levels(testSet$y))) : 
  No valid data provided.
#===================================================================================
# multiple algorithms
#===================================================================================
fitControl <- trainControl(
  method = "repeatedcv",
  number = 5,classProbs = TRUE,
  repeats = 5)
met=c("LogitBoost", 'xgbTree', 'rf', 'svmRadial')
re=list()
for (i in seq_along(met)) {
  
 re[[i]]<-train(trainSet[,Topfivepred],trainSet[,outcomeName],method=met[i],
                
                    preProcess = c("center","scale"))  
}

Save models to avoid running the model every time.

save(re,file="multiplealgorithms.RData")

Boosted Logistic Regression

re[[1]]%>%head()
$method
[1] "LogitBoost"

$modelInfo
$modelInfo$label
[1] "Boosted Logistic Regression"

$modelInfo$library
[1] "caTools"

$modelInfo$loop
function (grid) 
{
    loop <- grid[which.max(grid$nIter), , drop = FALSE]
    submodels <- grid[-which.max(grid$nIter), , drop = FALSE]
    submodels <- list(submodels)
    list(loop = loop, submodels = submodels)
}

$modelInfo$type
[1] "Classification"

$modelInfo$parameters

$modelInfo$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(nIter = 1 + ((1:len) * 10))
    }
    else {
        out <- data.frame(nIter = unique(sample(1:100, size = len, 
            replace = TRUE)))
    }
    out
}

$modelInfo$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    caTools::LogitBoost(as.matrix(x), y, nIter = param$nIter)
}

$modelInfo$predict
function (modelFit, newdata, submodels = NULL) 
{
    out <- caTools::predict.LogitBoost(modelFit, newdata, type = "class")
    if (!is.null(submodels)) {
        tmp <- out
        out <- vector(mode = "list", length = nrow(submodels) + 
            1)
        out[[1]] <- tmp
        for (j in seq(along = submodels$nIter)) {
            out[[j + 1]] <- caTools::predict.LogitBoost(modelFit, 
                newdata, nIter = submodels$nIter[j])
        }
    }
    out
}

$modelInfo$prob
function (modelFit, newdata, submodels = NULL) 
{
    out <- caTools::predict.LogitBoost(modelFit, newdata, type = "raw")
    out <- t(apply(out, 1, function(x) x/sum(x)))
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels) + 
            1)
        tmp[[1]] <- out
        for (j in seq(along = submodels$nIter)) {
            tmpProb <- caTools::predict.LogitBoost(modelFit, 
                newdata, type = "raw", nIter = submodels$nIter[j])
            tmpProb <- out <- t(apply(tmpProb, 1, function(x) x/sum(x)))
            tmp[[j + 1]] <- as.data.frame(tmpProb[, modelFit$obsLevels, 
                drop = FALSE])
        }
        out <- tmp
    }
    out
}

$modelInfo$predictors
function (x, ...) 
{
    if (!is.null(x$xNames)) {
        out <- unique(x$xNames[x$Stump[, "feature"]])
    }
    else out <- NA
    out
}

$modelInfo$levels
function (x) 
x$obsLevels

$modelInfo$tags
[1] "Ensemble Model"             "Boosting"                   "Implicit Feature Selection"
[4] "Tree-Based Model"           "Logistic Regression"       

$modelInfo$sort
function (x) 
x[order(x[, 1]), ]


$modelType
[1] "Classification"

$results

$pred
NULL

$bestTune
re[[1]][[4]]%>%kable()

nIter Accuracy Kappa AccuracySD KappaSD
11 0.8954207 0.3578465 0.0039172 0.0544693
21 0.8935720 0.3555484 0.0041170 0.0617421
31 0.8940555 0.3667228 0.0052416 0.0519837

plot(re[[1]])

varImp(object=re[[1]])
ROC curve variable importance

                Importance
duration            100.00
housingyes           31.53
contactunknown       30.61
poutcomesuccess      22.39
monthmar              0.00
# Predict using the test data
pred<-predict(re[[1]],testSet)
my_data=data.frame(cbind(predicted=pred,observed=testSet$y))
ggplot(my_data,aes(predicted,observed))+geom_point()+geom_smooth(method=lm)+ggtitle('Boosted Logistic Regression')

# Print, plot variable importance
print(varImp(re[[1]], scale = FALSE))
ROC curve variable importance

                Importance
duration            0.8045
housingyes          0.6103
contactunknown      0.6077
poutcomesuccess     0.5844
monthmar            0.5209
plot(varImp(re[[1]], scale = FALSE), main="Boosted Logistic Regression")

confusionMatrix(testSet$y,pred)
Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 11370   606
         1  1585     1
                                          
               Accuracy : 0.8384          
                 95% CI : (0.8321, 0.8446)
    No Information Rate : 0.9552          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : -0.0682         
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.8776534       
            Specificity : 0.0016474       
         Pos Pred Value : 0.9493988       
         Neg Pred Value : 0.0006305       
             Prevalence : 0.9552426       
         Detection Rate : 0.8383719       
   Detection Prevalence : 0.8830556       
      Balanced Accuracy : 0.4396504       
                                          
       'Positive' Class : 0               
                                          

EXtreme Gradient Boosting

#re[[2]]
re[[2]][[4]]%>%head()%>%kable()

eta max_depth gamma colsample_bytree min_child_weight subsample nrounds Accuracy Kappa AccuracySD KappaSD
1 0.3 1 0 0.6 1 0.50 50 0.8999284 0.3848753 0.0023361 0.0209991
4 0.3 1 0 0.6 1 0.75 50 0.8999285 0.3836727 0.0023861 0.0257059
7 0.3 1 0 0.6 1 1.00 50 0.8998351 0.3733176 0.0024587 0.0317696
10 0.3 1 0 0.8 1 0.50 50 0.8998809 0.3832630 0.0026702 0.0178440
13 0.3 1 0 0.8 1 0.75 50 0.9000034 0.3795111 0.0028604 0.0194459
16 0.3 1 0 0.8 1 1.00 50 0.8999448 0.3766056 0.0029003 0.0150799

plot(re[[2]])

varImp(object=re[[2]])
xgbTree variable importance

                Overall
duration        100.000
poutcomesuccess  34.165
contactunknown   22.346
monthmar          7.693
housingyes        0.000
# Predict using the test data
pred<-predict(re[[2]],testSet)
my_data=data.frame(cbind(predicted=pred,observed=testSet$y))
ggplot(my_data,aes(predicted,observed))+geom_point()+geom_smooth(method=lm)+ggtitle('EXtreme Gradient Boosting')

# Print, plot variable importance
print(varImp(re[[2]], scale = FALSE))
xgbTree variable importance

                Overall
duration        0.60900
poutcomesuccess 0.20806
contactunknown  0.13609
monthmar        0.04685
housingyes      0.00000
plot(varImp(re[[2]], scale = FALSE), main="EXtreme Gradient Boosting")

confusionMatrix(testSet$y,pred)
Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 11956    20
         1  1574    12
                                          
               Accuracy : 0.8825          
                 95% CI : (0.8769, 0.8878)
    No Information Rate : 0.9976          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0.0103          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.883666        
            Specificity : 0.375000        
         Pos Pred Value : 0.998330        
         Neg Pred Value : 0.007566        
             Prevalence : 0.997640        
         Detection Rate : 0.881581        
   Detection Prevalence : 0.883056        
      Balanced Accuracy : 0.629333        
                                          
       'Positive' Class : 0               
                                          

Random Forest

re[[3]]
Random Forest 

31649 samples
    5 predictor
    2 classes: '0', '1' 

Pre-processing: centered (5), scaled (5) 
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 31649, 31649, 31649, 31649, 31649, 31649, ... 
Resampling results across tuning parameters:

  mtry  Accuracy   Kappa    
  2     0.9004525  0.3781019
  3     0.9006479  0.3986243
  5     0.8831291  0.3440634

Accuracy was used to select the optimal model using  the largest value.
The final value used for the model was mtry = 3.
re[[3]][[4]]%>%kable()

mtry Accuracy Kappa AccuracySD KappaSD
2 0.9004525 0.3781019 0.0019847 0.0127565
3 0.9006479 0.3986243 0.0023334 0.0132605
5 0.8831291 0.3440634 0.0023667 0.0104566

plot(re[[3]])

varImp(object=re[[3]])
rf variable importance

                Overall
duration        100.000
poutcomesuccess  37.522
housingyes        1.609
contactunknown    1.115
monthmar          0.000
# Predict using the test data
pred<-predict(re[[3]],testSet)
my_data=data.frame(cbind(predicted=pred,observed=testSet$y))
ggplot(my_data,aes(predicted,observed))+geom_point()+geom_smooth(method=lm)+ggtitle('Random Forest')

# Print, plot variable importance
print(varImp(re[[3]], scale = FALSE))
rf variable importance

                Overall
duration        1331.47
poutcomesuccess  561.86
housingyes       119.47
contactunknown   113.39
monthmar          99.65
plot(varImp(re[[3]], scale = FALSE), main="Random Forest")

confusionMatrix(testSet$y,pred)
Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 11682   294
         1  1000   586
                                          
               Accuracy : 0.9046          
                 95% CI : (0.8995, 0.9095)
    No Information Rate : 0.9351          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0.4275          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.9211          
            Specificity : 0.6659          
         Pos Pred Value : 0.9755          
         Neg Pred Value : 0.3695          
             Prevalence : 0.9351          
         Detection Rate : 0.8614          
   Detection Prevalence : 0.8831          
      Balanced Accuracy : 0.7935          
                                          
       'Positive' Class : 0               
                                          

Support Vector Machines with Radial Basis Function Kernel

fitControl <- trainControl(method = "repeatedcv",number = 5,repeats = 5,
                           allowParallel = TRUE )
model_svm<-train(trainSet[,Topfivepred],trainSet[,outcomeName],method='svmRadial',
                  trControl=fitControl)
#predictions<-predict.train(object=re[[4]],testSet[,predictors],type="raw")
#predictions
model_svm
Support Vector Machines with Radial Basis Function Kernel 

31649 samples
    5 predictor
    2 classes: '0', '1' 

No pre-processing
Resampling: Cross-Validated (5 fold, repeated 5 times) 
Summary of sample sizes: 25319, 25319, 25319, 25320, 25319, 25318, ... 
Resampling results across tuning parameters:

  C     Accuracy   Kappa    
  0.25  0.9014246  0.4084901
  0.50  0.9011237  0.4078208
  1.00  0.9011023  0.4077365

Tuning parameter 'sigma' was held constant at a value of 3.884677
Accuracy was used to select the optimal model using  the largest value.
The final values used for the model were sigma = 3.884677 and C = 0.25.
save(model_svm,file="model_svm.RData")
 
load("model_svm.RData")
plot(model_svm)

varImp(object=model_svm)
ROC curve variable importance

                Importance
duration            100.00
housingyes           31.41
contactunknown       30.32
poutcomesuccess      23.13
monthmar              0.00
# Predict using the test data
#pred<-predict(model_svm,testSet)
predictions<-predict.train(object=model_svm,testSet[,Topfivepred],type="raw")
my_data=data.frame(cbind(predicted=predictions,observed=testSet$y))
ggplot(my_data,aes(predicted,observed))+geom_point()+geom_smooth(method=lm)+ggtitle('Support Vector Machines with Radial Basis Function Kernel')

# Print, plot variable importance
print(varImp(model_svm, scale = FALSE))
ROC curve variable importance

                Importance
duration            0.8078
housingyes          0.6093
contactunknown      0.6062
poutcomesuccess     0.5854
monthmar            0.5185
plot(varImp(model_svm, scale = FALSE), main="Support Vector Machines with Radial Basis Function Kernel")

confusionMatrix(testSet$y,predictions)
Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 11657   319
         1   996   590
                                         
               Accuracy : 0.903          
                 95% CI : (0.8979, 0.908)
    No Information Rate : 0.933          
    P-Value [Acc > NIR] : 1              
                                         
                  Kappa : 0.4239         
 Mcnemar's Test P-Value : <2e-16         
                                         
            Sensitivity : 0.9213         
            Specificity : 0.6491         
         Pos Pred Value : 0.9734         
         Neg Pred Value : 0.3720         
             Prevalence : 0.9330         
         Detection Rate : 0.8595         
   Detection Prevalence : 0.8831         
      Balanced Accuracy : 0.7852         
                                         
       'Positive' Class : 0              
                                         
predictions%>%head()
[1] 0 0 0 0 0 0
Levels: 0 1

Find models that are supported by the caret package. There are over 200 models that can be implemented in the caret package at last count.

names(getModelInfo())
  [1] "ada"                 "AdaBag"              "AdaBoost.M1"         "adaboost"           
  [5] "amdai"               "ANFIS"               "avNNet"              "awnb"               
  [9] "awtan"               "bag"                 "bagEarth"            "bagEarthGCV"        
 [13] "bagFDA"              "bagFDAGCV"           "bam"                 "bartMachine"        
 [17] "bayesglm"            "binda"               "blackboost"          "blasso"             
 [21] "blassoAveraged"      "bridge"              "brnn"                "BstLm"              
 [25] "bstSm"               "bstTree"             "C5.0"                "C5.0Cost"           
 [29] "C5.0Rules"           "C5.0Tree"            "cforest"             "chaid"              
 [33] "CSimca"              "ctree"               "ctree2"              "cubist"             
 [37] "dda"                 "deepboost"           "DENFIS"              "dnn"                
 [41] "dwdLinear"           "dwdPoly"             "dwdRadial"           "earth"              
 [45] "elm"                 "enet"                "evtree"              "extraTrees"         
 [49] "fda"                 "FH.GBML"             "FIR.DM"              "foba"               
 [53] "FRBCS.CHI"           "FRBCS.W"             "FS.HGD"              "gam"                
 [57] "gamboost"            "gamLoess"            "gamSpline"           "gaussprLinear"      
 [61] "gaussprPoly"         "gaussprRadial"       "gbm_h2o"             "gbm"                
 [65] "gcvEarth"            "GFS.FR.MOGUL"        "GFS.GCCL"            "GFS.LT.RS"          
 [69] "GFS.THRIFT"          "glm.nb"              "glm"                 "glmboost"           
 [73] "glmnet_h2o"          "glmnet"              "glmStepAIC"          "gpls"               
 [77] "hda"                 "hdda"                "hdrda"               "HYFIS"              
 [81] "icr"                 "J48"                 "JRip"                "kernelpls"          
 [85] "kknn"                "knn"                 "krlsPoly"            "krlsRadial"         
 [89] "lars"                "lars2"               "lasso"               "lda"                
 [93] "lda2"                "leapBackward"        "leapForward"         "leapSeq"            
 [97] "Linda"               "lm"                  "lmStepAIC"           "LMT"                
[101] "loclda"              "logicBag"            "LogitBoost"          "logreg"             
[105] "lssvmLinear"         "lssvmPoly"           "lssvmRadial"         "lvq"                
[109] "M5"                  "M5Rules"             "manb"                "mda"                
[113] "Mlda"                "mlp"                 "mlpML"               "mlpSGD"             
[117] "mlpWeightDecay"      "mlpWeightDecayML"    "monmlp"              "msaenet"            
[121] "multinom"            "naive_bayes"         "nb"                  "nbDiscrete"         
[125] "nbSearch"            "neuralnet"           "nnet"                "nnls"               
[129] "nodeHarvest"         "oblique.tree"        "OneR"                "ordinalNet"         
[133] "ORFlog"              "ORFpls"              "ORFridge"            "ORFsvm"             
[137] "ownn"                "pam"                 "parRF"               "PART"               
[141] "partDSA"             "pcaNNet"             "pcr"                 "pda"                
[145] "pda2"                "penalized"           "PenalizedLDA"        "plr"                
[149] "pls"                 "plsRglm"             "polr"                "ppr"                
[153] "PRIM"                "protoclass"          "pythonKnnReg"        "qda"                
[157] "QdaCov"              "qrf"                 "qrnn"                "randomGLM"          
[161] "ranger"              "rbf"                 "rbfDDA"              "Rborist"            
[165] "rda"                 "regLogistic"         "relaxo"              "rf"                 
[169] "rFerns"              "RFlda"               "rfRules"             "ridge"              
[173] "rlda"                "rlm"                 "rmda"                "rocc"               
[177] "rotationForest"      "rotationForestCp"    "rpart"               "rpart1SE"           
[181] "rpart2"              "rpartCost"           "rpartScore"          "rqlasso"            
[185] "rqnc"                "RRF"                 "RRFglobal"           "rrlda"              
[189] "RSimca"              "rvmLinear"           "rvmPoly"             "rvmRadial"          
[193] "SBC"                 "sda"                 "sdwd"                "simpls"             
[197] "SLAVE"               "slda"                "smda"                "snn"                
[201] "sparseLDA"           "spikeslab"           "spls"                "stepLDA"            
[205] "stepQDA"             "superpc"             "svmBoundrangeString" "svmExpoString"      
[209] "svmLinear"           "svmLinear2"          "svmLinear3"          "svmLinearWeights"   
[213] "svmLinearWeights2"   "svmPoly"             "svmRadial"           "svmRadialCost"      
[217] "svmRadialSigma"      "svmRadialWeights"    "svmSpectrumString"   "tan"                
[221] "tanSearch"           "treebag"             "vbmpRadial"          "vglmAdjCat"         
[225] "vglmContRatio"       "vglmCumulative"      "widekernelpls"       "WM"                 
[229] "wsrf"                "xgbLinear"           "xgbTree"             "xyf"                
#===================================================================================
#gradient boosted trees
# parameter tuning
#===================================================================================


fitControl <- trainControl(method = "repeatedcv",number = 5,repeats = 5,
                           allowParallel = TRUE )




#fitControl <- trainControl(
#  method = "repeatedcv",
#  number = 5,classProbs = TRUE,
#  repeats = 5,allowParallel = TRUE)

#Creating grid
grid <- expand.grid(n.trees=c(10,20,50,100,500,1000),shrinkage=c(0.01,0.05,0.1,0.5),n.minobsinnode = c(3,5,10),interaction.depth=c(1,5,10))
# training the model
model_gbm<-train(trainSet[,Topfivepred],trainSet[,outcomeName],method='gbm',trControl=fitControl,tuneGrid=grid)
# summarizing the model
print(model_gbm)
modelLookup(model='gbm')
#model_gbm%>%as.list.data.frame()%>%kable()
save(model_gbm,file="model_gbm.RData")
 
load("model_gbm.RData")
 
 
model_gbm$bestTune%>%kable()

n.trees interaction.depth shrinkage n.minobsinnode
54 1000 10 0.01 10

 
model_gbm$results%>%head()%>%kable()

shrinkage interaction.depth n.minobsinnode n.trees Accuracy Kappa AccuracySD KappaSD
1 0.01 1 3 10 0.8829979 0 7.02e-05 0
7 0.01 1 5 10 0.8829979 0 7.02e-05 0
13 0.01 1 10 10 0.8829979 0 7.02e-05 0
55 0.05 1 3 10 0.8829979 0 7.02e-05 0
61 0.05 1 5 10 0.8829979 0 7.02e-05 0
67 0.05 1 10 10 0.8829979 0 7.02e-05 0

#various of finding the row with maximum accuracy
model_gbm$results[which.max(model_gbm$results$Accuracy),]   
    
    
model_gbm$results%>%filter()%>%dplyr::summarise(max1=max(Accuracy))
    
    
model_gbm$results %>% dplyr::slice(which.min(Accuracy ))
    
model_gbm$results%>%dplyr::slice(which.max(Accuracy ))
    
model_gbm$results[ which(model_gbm$results$Accuracy ==max(model_gbm$results$Accuracy))  ,]
plot(model_gbm)

# 
# pred<-predict(model_gbm,iris_test)
# 
# Conf_matrix<-confusionMatrix(pred,iris[1:5,5])
# 
# kable(Conf_matrix$table)
#using tune length
fitControl <- trainControl(method = "repeatedcv",number = 5,repeats = 5,
                           allowParallel = TRUE )
model_gbm2<-train(trainSet[,Topfivepred],trainSet[,outcomeName],method='gbm',
                  trControl=fitControl)

We can use tuneLength instead of specifying the value of each parameter. This allows any number of possible values for each tuning parameter through tuneLength.

# model_gbm3<-train(trainSet[,Topfivepred],trainSet[,outcomeName],method='gbm',
#                   trControl=fitControl,interaction.depth=10,n.trees=100,n.minobsinnode=10)
# print(model_gbm)
save(model_gbm2,file="model_gbm2.RData")
 
 load("model_gbm2.RData")
print(model_gbm2)
Stochastic Gradient Boosting 

31649 samples
    5 predictor
    2 classes: '0', '1' 

No pre-processing
Resampling: Cross-Validated (5 fold, repeated 5 times) 
Summary of sample sizes: 25319, 25319, 25319, 25319, 25320, 25319, ... 
Resampling results across tuning parameters:

  interaction.depth  n.trees  Accuracy   Kappa    
  1                   50      0.8964577  0.2749744
  1                  100      0.9012330  0.4017715
  1                  150      0.9022024  0.4202335
  2                   50      0.9018471  0.4158550
  2                  100      0.9023276  0.4321845
  2                  150      0.9025777  0.4319439
  3                   50      0.9022814  0.4301685
  3                  100      0.9021306  0.4317884
  3                  150      0.9020157  0.4264921

Tuning parameter 'shrinkage' was held constant at a value of 0.1
Tuning parameter 'n.minobsinnode'
 was held constant at a value of 10
Accuracy was used to select the optimal model using  the largest value.
The final values used for the model were n.trees = 150, interaction.depth = 2, shrinkage = 0.1
 and n.minobsinnode = 10.
plot(model_gbm2)

varImp(object=model_gbm2)
gbm variable importance

                Overall
duration        100.000
poutcomesuccess  40.234
contactunknown    2.495
housingyes        2.389
monthmar          0.000
# Predict using the test data
pred<-predict(model_gbm2,testSet)
my_data=data.frame(cbind(predicted=pred,observed=testSet$y))
ggplot(my_data,aes(predicted,observed))+geom_point()+geom_smooth(method=lm)+ggtitle('Stochastic Gradient Boosting Machine')

# Print, plot variable importance
print(varImp(model_gbm2, scale = FALSE))
gbm variable importance

                Overall
duration         1576.5
poutcomesuccess   707.1
contactunknown    158.2
housingyes        156.6
monthmar          121.9
plot(varImp(model_gbm2, scale = FALSE), main="Stochastic Gradient Boosting")

confusionMatrix(testSet$y,pred)
Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 11976     0
         1  1586     0
                                          
               Accuracy : 0.8831          
                 95% CI : (0.8775, 0.8884)
    No Information Rate : 1               
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0               
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.8831          
            Specificity :     NA          
         Pos Pred Value :     NA          
         Neg Pred Value :     NA          
             Prevalence : 1.0000          
         Detection Rate : 0.8831          
   Detection Prevalence : 0.8831          
      Balanced Accuracy :     NA          
                                          
       'Positive' Class : 0               
                                          

The maximum accuracy of 0.9059 occurs at these parameter combinations shrinkage=0.01,interaction.depth=10,n.minobsinnode=10 and n.trees=1000. The mew model will be fitted with these parameter values.

Neural networks

model_nnet<-train(trainSet[,Topfivepred],trainSet[,outcomeName],method="nnet",trControl=fitControl)
save(model_nnet,file="model_nnet.RData")
 
load("model_nnet.RData")
print(model_nnet)
Neural Network 

31649 samples
    5 predictor
    2 classes: '0', '1' 

No pre-processing
Resampling: Cross-Validated (5 fold, repeated 5 times) 
Summary of sample sizes: 25319, 25319, 25319, 25320, 25319, 25320, ... 
Resampling results across tuning parameters:

  size  decay  Accuracy   Kappa    
  1     0e+00  0.8888624  0.1414151
  1     1e-04  0.8901819  0.1786869
  1     1e-01  0.8933553  0.2594665
  3     0e+00  0.8977597  0.3529213
  3     1e-04  0.8971071  0.3353283
  3     1e-01  0.9027078  0.4248171
  5     0e+00  0.9004840  0.3994720
  5     1e-04  0.9004154  0.3905834
  5     1e-01  0.9027372  0.4255974

Accuracy was used to select the optimal model using  the largest value.
The final values used for the model were size = 5 and decay = 0.1.
plot(model_nnet)

varImp(object=model_nnet)
nnet variable importance

                Overall
contactunknown   100.00
poutcomesuccess   88.71
duration          63.52
housingyes        17.35
monthmar           0.00
# Predict using the test data
pred<-predict(model_nnet,testSet)
my_data=data.frame(cbind(predicted=pred,observed=testSet$y))
ggplot(my_data,aes(predicted,observed))+geom_point()+geom_smooth(method=lm)+ggtitle('Stochastic Gradient Boosting Machine')

# Print, plot variable importance
print(varImp(model_nnet, scale = FALSE))
nnet variable importance

                Overall
contactunknown   30.485
poutcomesuccess  27.916
duration         22.185
housingyes       11.681
monthmar          7.732
plot(varImp(model_nnet, scale = FALSE), main="Stochastic Gradient Boosting")

confusionMatrix(testSet$y,pred)
Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 11654   322
         1   991   595
                                          
               Accuracy : 0.9032          
                 95% CI : (0.8981, 0.9081)
    No Information Rate : 0.9324          
    P-Value [Acc > NIR] : 1               
                                          
                  Kappa : 0.4263          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.9216          
            Specificity : 0.6489          
         Pos Pred Value : 0.9731          
         Neg Pred Value : 0.3752          
             Prevalence : 0.9324          
         Detection Rate : 0.8593          
   Detection Prevalence : 0.8831          
      Balanced Accuracy : 0.7852          
                                          
       'Positive' Class : 0               
                                          
stopImplicitCluster()
LS0tCnRpdGxlOiAiUHJlZGljdGluZyBCYW5rIGxvYW4gIFRlcm0gRGVwb3NpdCIKYXV0aG9yOiAiTmFuYSBCb2F0ZW5nIgpvdXRwdXQ6CiAgaHRtbF9kb2N1bWVudDoKICAgIGRmX3ByaW50OiBwYWdlZAotLS0KClRoZSBkYXRhIGlzIHJlbGF0ZWQgd2l0aCBkaXJlY3QgbWFya2V0aW5nIGNhbXBhaWducyAocGhvbmUgY2FsbHMpIG9mIGEgUG9ydHVndWVzZSBiYW5raW5nIGluc3RpdHV0aW9uLiBUaGUgY2xhc3NpZmljYXRpb24gZ29hbCBpcyB0byBwcmVkaWN0IGlmIHRoZSBjbGllbnQgd2lsbCBzdWJzY3JpYmUgYSB0ZXJtIGRlcG9zaXQgKHZhcmlhYmxlIHkpLgoKCkRhdGEgYXZhaWxhYmxlIGF0IFVDSSBtYWNoaW5lIGxlYXJuaW5nIHJlcG9zaXRvcnkgW2hlcmVdKGh0dHBzOi8vYXJjaGl2ZS5pY3MudWNpLmVkdS9tbC9kYXRhc2V0cy9iYW5rK21hcmtldGluZykgCgpDaXRhdGlvbiBSZXF1ZXN0OgpbTW9ybyBldCBhbC4sIDIwMTRdIFMuIE1vcm8sIFAuIENvcnRleiBhbmQgUC4gUml0YS4gQSBEYXRhLURyaXZlbiBBcHByb2FjaCB0byBQcmVkaWN0IHRoZSBTdWNjZXNzIG9mIEJhbmsgVGVsZW1hcmtldGluZy4gRGVjaXNpb24gU3VwcG9ydCBTeXN0ZW1zLCBFbHNldmllciwgNjI6MjItMzEsIEp1bmUgMjAxNAoKCgoKKipJbnB1dCB2YXJpYWJsZXM6KioKICAgVGhlIGRhdGEgaXMgcmVsYXRlZCB3aXRoIGRpcmVjdCBtYXJrZXRpbmcgY2FtcGFpZ25zIG9mIGEgUG9ydHVndWVzZSBiYW5raW5nIGluc3RpdHV0aW9uLiAKICAgVGhlIG1hcmtldGluZyBjYW1wYWlnbnMgd2VyZSBiYXNlZCBvbiBwaG9uZSBjYWxscy4gT2Z0ZW4sIG1vcmUgdGhhbiBvbmUgY29udGFjdCB0byB0aGUgc2FtZSBjbGllbnQgd2FzIHJlcXVpcmVkLCBpbiBvcmRlciB0byBhY2Nlc3MgaWYgdGhlIHByb2R1Y3QgKGJhbmsgdGVybSBkZXBvc2l0KSB3b3VsZCBiZSAob3Igbm90KSBzdWJzY3JpYmVkLiAKICAgCiAgIFRoZSBjbGFzc2lmaWNhdGlvbiBnb2FsIGlzIHRvIHByZWRpY3QgaWYgdGhlIGNsaWVudCB3aWxsIHN1YnNjcmliZSBhIHRlcm0gZGVwb3NpdCAodmFyaWFibGUgeSkuCgo1LiBOdW1iZXIgb2YgSW5zdGFuY2VzOiA0NTIxMSBmb3IgYmFuay1mdWxsLmNzdiAoNDUyMSBmb3IgYmFuay5jc3YpCgo2LiBOdW1iZXIgb2YgQXR0cmlidXRlczogMTYgKyBvdXRwdXQgYXR0cmlidXRlLgoKNy4gQXR0cmlidXRlIGluZm9ybWF0aW9uOgoKICAgRm9yIG1vcmUgaW5mb3JtYXRpb24sIHJlYWQgW01vcm8gZXQgYWwuLCAyMDExXS4KCiAgIElucHV0IHZhcmlhYmxlczoKICAgIyBiYW5rIGNsaWVudCBkYXRhOgogICAqMSAtIGFnZSAobnVtZXJpYykKICAgKjIgLSBqb2IgOiB0eXBlIG9mIGpvYiAoY2F0ZWdvcmljYWw6ICJhZG1pbi4iLCJ1bmtub3duIiwidW5lbXBsb3llZCIsIm1hbmFnZW1lbnQiLCJob3VzZW1haWQiLCJlbnRyZXByZW5ldXIiLCJzdHVkZW50IiwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgImJsdWUtY29sbGFyIiwic2VsZi1lbXBsb3llZCIsInJldGlyZWQiLCJ0ZWNobmljaWFuIiwic2VydmljZXMiKSAKICAgKjMgLSBtYXJpdGFsIDogbWFyaXRhbCBzdGF0dXMgKGNhdGVnb3JpY2FsOiAibWFycmllZCIsImRpdm9yY2VkIiwic2luZ2xlIjsgbm90ZTogImRpdm9yY2VkIiBtZWFucyBkaXZvcmNlZCBvciB3aWRvd2VkKQogICogNCAtIGVkdWNhdGlvbiAoY2F0ZWdvcmljYWw6ICJ1bmtub3duIiwic2Vjb25kYXJ5IiwicHJpbWFyeSIsInRlcnRpYXJ5IikKICAqIDUgLSBkZWZhdWx0OiBoYXMgY3JlZGl0IGluIGRlZmF1bHQ/IChiaW5hcnk6ICJ5ZXMiLCJubyIpCiAgICo2IC0gYmFsYW5jZTogYXZlcmFnZSB5ZWFybHkgYmFsYW5jZSwgaW4gZXVyb3MgKG51bWVyaWMpIAogICogNyAtIGhvdXNpbmc6IGhhcyBob3VzaW5nIGxvYW4/IChiaW5hcnk6ICJ5ZXMiLCJubyIpCiAgICo4IC0gbG9hbjogaGFzIHBlcnNvbmFsIGxvYW4/IChiaW5hcnk6ICJ5ZXMiLCJubyIpCiAgICMgcmVsYXRlZCB3aXRoIHRoZSBsYXN0IGNvbnRhY3Qgb2YgdGhlIGN1cnJlbnQgY2FtcGFpZ246CiAgKiA5IC0gY29udGFjdDogY29udGFjdCBjb21tdW5pY2F0aW9uIHR5cGUgKGNhdGVnb3JpY2FsOiAidW5rbm93biIsInRlbGVwaG9uZSIsImNlbGx1bGFyIikgCiAgKjEwIC0gZGF5OiBsYXN0IGNvbnRhY3QgZGF5IG9mIHRoZSBtb250aCAobnVtZXJpYykKICAqMTEgLSBtb250aDogbGFzdCBjb250YWN0IG1vbnRoIG9mIHllYXIgKGNhdGVnb3JpY2FsOiAiamFuIiwgImZlYiIsICJtYXIiLCAuLi4sICJub3YiLCAiZGVjIikKICAqMTIgLSBkdXJhdGlvbjogbGFzdCBjb250YWN0IGR1cmF0aW9uLCBpbiBzZWNvbmRzIChudW1lcmljKQogICAjIG90aGVyIGF0dHJpYnV0ZXM6CiAgKjEzIC0gY2FtcGFpZ246IG51bWJlciBvZiBjb250YWN0cyBwZXJmb3JtZWQgZHVyaW5nIHRoaXMgY2FtcGFpZ24gYW5kIGZvciB0aGlzIGNsaWVudCAobnVtZXJpYywgaW5jbHVkZXMgbGFzdCBjb250YWN0KQogICoxNCAtIHBkYXlzOiBudW1iZXIgb2YgZGF5cyB0aGF0IHBhc3NlZCBieSBhZnRlciB0aGUgY2xpZW50IHdhcyBsYXN0IGNvbnRhY3RlZCBmcm9tIGEgcHJldmlvdXMgY2FtcGFpZ24gKG51bWVyaWMsIC0xIG1lYW5zIGNsaWVudCB3YXMgbm90IHByZXZpb3VzbHkgY29udGFjdGVkKQogICoxNSAtIHByZXZpb3VzOiBudW1iZXIgb2YgY29udGFjdHMgcGVyZm9ybWVkIGJlZm9yZSB0aGlzIGNhbXBhaWduIGFuZCBmb3IgdGhpcyBjbGllbnQgKG51bWVyaWMpCiAgKjE2IC0gcG91dGNvbWU6IG91dGNvbWUgb2YgdGhlIHByZXZpb3VzIG1hcmtldGluZyBjYW1wYWlnbiAoY2F0ZWdvcmljYWw6ICJ1bmtub3duIiwib3RoZXIiLCJmYWlsdXJlIiwic3VjY2VzcyIpCgogIE91dHB1dCB2YXJpYWJsZSAoZGVzaXJlZCB0YXJnZXQpOgogICoxNyAtIHkgLSBoYXMgdGhlIGNsaWVudCBzdWJzY3JpYmVkIGEgdGVybSBkZXBvc2l0PyAoYmluYXJ5OiAieWVzIiwibm8iKQoKKjguIE1pc3NpbmcgQXR0cmlidXRlIFZhbHVlczogTm9uZQoKCgoKCgoKYGBge3J9CgpwYWNtYW46OnBfbG9hZCgibHVicmlkYXRlIiwgImRwbHlyIiwgIm1hZ3JpdHRyIikKCmxpYnJhcnkocmlvKQpsaWJyYXJ5KGRvUGFyYWxsZWwpCmxpYnJhcnkodmlyaWRpcykKbGlicmFyeShSQ29sb3JCcmV3ZXIpCmxpYnJhcnkodGlkeXZlcnNlKQpsaWJyYXJ5KGdndGhlbWVzKQpsaWJyYXJ5KGtuaXRyKQpsaWJyYXJ5KHRpZHl2ZXJzZSkKbGlicmFyeShjYXJldCkKbGlicmFyeShjYXJldEVuc2VtYmxlKQpsaWJyYXJ5KHBsb3RseSkKbGlicmFyeShsaW1lKQpsaWJyYXJ5KHBsb3RST0MpCgojIENhbGN1bGF0ZSB0aGUgbnVtYmVyIG9mIGNvcmVzCm5vX2NvcmVzIDwtIGRldGVjdENvcmVzKCkgLSAxCgpjbDwtbWFrZUNsdXN0ZXIobm9fY29yZXMpCnJlZ2lzdGVyRG9QYXJhbGxlbChjbCkKCgpzZXR3ZCgiL1VzZXJzL25hbmFha3dhc2lhYmF5aWVib2F0ZW5nL0RvY3VtZW50cy9tZW1waGlzY2xhc3Nlc2Jvb2tzL0RhdGFNaW5pbmdzY2llbmNlIikKI2xvYWQgZXhjZWwgZmlsZSB3aXRoIHJpbwpEYXRhPC0gcmlvOjppbXBvcnQoIi9Vc2Vycy9uYW5hYWt3YXNpYWJheWllYm9hdGVuZy9Eb2N1bWVudHMvbWVtcGhpc2NsYXNzZXNib29rcy9EYXRhTWluaW5nc2NpZW5jZS9Bbm9tYWx5ZGV0ZWN0aW9uL2JhbmsvYmFuay1mdWxsLmNzdiIpCgpEYXRhJT4laGVhZAojJT4la2FibGUoKQoKCmBgYAoKCgoKCmBgYHtyfQoKc3VtbWFyeShEYXRhKQoKCmBgYAoKCgoKVGhlcmUgYXJlIG5vIG1pc3Npbmcgb2JzZXJ2YXRpb25zIGluIHRoZSBkYXRhLgpgYGB7cn0KIz09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQojY2hlY2sgdGhlIG51bWJlciBvZiBtaXNzaW5nIHJvd3MKIz09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQoKY29sU3Vtcyhpcy5uYS5kYXRhLmZyYW1lKERhdGEpKQoKCmBgYAoKCgpgYGB7cn0KIz09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQojIGRlc2NyaXB0aXZlL3N1bW1hcnkgc3RhdGlzdGljcwojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09CgoKCkhtaXNjOjpkZXNjcmliZS5kYXRhLmZyYW1lKERhdGEpICAgCiNkZXNjcmliZShEYXRhKQpgYGAKCgoKYGBge3J9CiM9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KIyBIaXN0b2dyYW1zCiM9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KdGhlbWVfc2V0KHRoZW1lX2Vjb25vbWlzdF93aGl0ZSgpKQoKI2dncGxvdChEYXRhKSArIGdlb21fYm94cGxvdChhZXMoeCA9YWdlLHk9ZHVyYXRpb24sY29sb3I9eSkpCgpnZ3Bsb3QoRGF0YSwgYWVzKHggPSIiLHk9YWdlLCBmaWxsPXkpKSsgZ2VvbV9ib3hwbG90KCkrbGFicyh4PSJhZ2UiLHk9IiIpCiNnZ3Bsb3RseShwKQoKZ2dwbG90KERhdGEsIGFlcyh4ID1kdXJhdGlvbiwgZmlsbD15KSkrIGdlb21faGlzdG9ncmFtKGJpbnMgPSAzMCkKI2dncGxvdGx5KCkKCmdncGxvdChEYXRhLCBhZXMoeCA9YWdlLCBmaWxsPXkpKSsgZ2VvbV9oaXN0b2dyYW0oYmlucyA9IDMwKQojIGdncGxvdGx5KCkKIApnZ3Bsb3QoRGF0YSwgYWVzKHggPWRheSwgZmlsbD15KSkrIGdlb21faGlzdG9ncmFtKGJpbnMgPSAzMCkKI2dncGxvdGx5KCkKCmdncGxvdChEYXRhLCBhZXMoeCA9YmFsYW5jZSwgZmlsbD15KSkrIGdlb21faGlzdG9ncmFtKGJpbnMgPSAzMCkKI2dncGxvdGx5KCkKCmdncGxvdChEYXRhLCBhZXMoeCA9YWdlLCBmaWxsPXkpKSsgZ2VvbV9oaXN0b2dyYW0oYmlucyA9IDMwKQojIGdncGxvdGx5KCkgCiAjIGdlb21fZGVuc2l0eShhbHBoYT0xLzMsY29sb3I9InJlZCIpICsgc2NhbGVfZmlsbF9odWUoKQoKZ2dwbG90KERhdGEsIGFlcyh4PWFnZSwgZmlsbD15KSkgKyBnZW9tX2RlbnNpdHkoYWxwaGE9MS8zKSArIHNjYWxlX2ZpbGxfaHVlKCkKCmdncGxvdGx5KCkKICAgICAgIApgYGAKCgoKCgoKQ29udmVydCBjaGFyYWN0ZXIgdmFyaWFibGVzIHRvIGZhY3RvciB2YXJpYWJsZXMuVGhpcyBpcyBuZWNjZXNzYXJ5IGZvciB0aGUgY2FyZXQgcGFja2FnZSB0byB0cmFpbiB0aGUgbW9kZWxzIHdlIGFyZSBpbnRlcmVzdGVkIGluIGxhdGVyLgoKYGBge3J9CgpEYXRhPC1EYXRhICU+JSBtdXRhdGVfaWYoaXMuY2hhcmFjdGVyLCBhcy5mYWN0b3IpIApzdHIoRGF0YSkKCmdncGxvdChEYXRhLCBhZXMoeCA9eSkpKyBnZW9tX2hpc3RvZ3JhbShiaW5zID0gNCxzdGF0PSJjb3VudCIpK2xhYnMoeD0iVGVybSBEZXBvc2l0IikKCmBgYAoKCmBgYHtyfQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09CgojQ29udmVydGluZyBvdXRjb21lIHZhcmlhYmxlIHRvIG51bWVyaWMKIz09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQoKRGF0YSR5PC1pZmVsc2UoRGF0YSR5PT0nbm8nLDAsMSkKCgoKc3RyKERhdGEpCgpnbGltcHNlKERhdGEpCmBgYAoKCkNvbnZlcnQgQ2F0ZWdvcmljYWwgdmFyaWFsYmVzIHRvIGR1bW15IHZhcmlhYmxlcyB1c2luZyBlaXRoZXIgIE1vZGVsLm1hdHJpeCBvciBzcGFyc2UubW9kZWwsbWF0cml4CgpgYGB7cn0KCgpwcmVkaWN0b3JzPC1zZXRkaWZmKG5hbWVzKERhdGEpLERhdGEkeSkKCiNuYW1lcyhEYXRhKQpwcmVkaWN0b3JzPC1uYW1lcyhEYXRhWywtMTddKQoKI3Bhc3RlMChwcmVkaWN0b3JzLHNlcD0iIixjb2xsYXBzZSA9ICIrICIpCgojcGFzdGUoIn4iLHBhc3RlMChwcmVkaWN0b3JzLHNlcD0iIixjb2xsYXBzZSA9ICIrICIpKQoKYXMuZm9ybXVsYShwYXN0ZSgifiIscGFzdGUwKHByZWRpY3RvcnMsc2VwPSIiLGNvbGxhcHNlID0gIisgIikpKQoKZDE8LW1vZGVsLm1hdHJpeChhcy5mb3JtdWxhKHBhc3RlKCJ+IixwYXN0ZTAocHJlZGljdG9ycyxzZXA9IiIsY29sbGFwc2UgPSAiKyAiKSkpLCBEYXRhKQoKZDFiPC1NYXRyaXg6OnNwYXJzZS5tb2RlbC5tYXRyaXgoYXMuZm9ybXVsYShwYXN0ZSgifiIscGFzdGUwKHByZWRpY3RvcnMsc2VwPSIiLGNvbGxhcHNlID0gIisgIikpKSwgRGF0YSkKCgoKI2QxPC1tb2RlbC5tYXRyaXgofmFnZStqb2IrbWFyaXRhbCtlZHVjYXRpb24gKyBkZWZhdWx0ICsgYmFsYW5jZSArIGhvdXNpbmcgKyAKIyAgICBsb2FuICsgY29udGFjdCArIGRheSArIG1vbnRoICsgZHVyYXRpb24gKyBjYW1wYWlnbiArIHBkYXlzICsgCiMgICAgcHJldmlvdXMgKyBwb3V0Y29tZSxEYXRhKQoKI2QxYjwtTWF0cml4OjpzcGFyc2UubW9kZWwubWF0cml4KH5hZ2Uram9iK21hcml0YWwrZWR1Y2F0aW9uICsgZGVmYXVsdCArIGJhbGFuY2UgKyBob3VzaW5nICsgCiMgICAgbG9hbiArIGNvbnRhY3QgKyBkYXkgKyBtb250aCArIGR1cmF0aW9uICsgY2FtcGFpZ24gKyBwZGF5cyArIAojICAgIHByZXZpb3VzICsgcG91dGNvbWUsRGF0YSkKCmhlYWQoZDEpCgpoZWFkKGQxYikKCmBgYAoKVGhlIGR1bW15IGNvbnZlcnNpb24gcmVzdWx0cyBpbiA0MiB2YXJpYWJsZXMuCmBgYHtyfQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09CiNjb252ZXJ0IGNhdGVnb3JpY2FsIHZhcmlhYmxlcyAgdG8gbnVtZXJpYyB2YXJpYWJsZXMKIz09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQoKCmRteSA8LSBkdW1teVZhcnMoIiB+IC4iLCBkYXRhID0gRGF0YSxmdWxsUmFuayA9IFQpCgp0cmFuc2Zvcm1lZCA8LSBkYXRhLmZyYW1lKHByZWRpY3QoZG15LCBuZXdkYXRhID1EYXRhKSkKCgojQ2hlY2tpbmcgdGhlIHN0cnVjdHVyZSBvZiB0cmFuc2Zvcm1lZCB0cmFpbiBmaWxlCgpzdHIodHJhbnNmb3JtZWQpCgpgYGAKCgoKCmBgYHtyfQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09CiNDb252ZXJ0aW5nIHRoZSBkZXBlbmRlbnQgdmFyaWFibGUgYmFjayB0byBjYXRlZ29yaWNhbAojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Cgp0cmFuc2Zvcm1lZCR5PC1hcy5mYWN0b3IodHJhbnNmb3JtZWQkeSkKCgoKCmBgYAoKCgpgYGB7cixtZXNzYWdlPUZBTFNFLHdhcm5pbmc9RkFMU0V9CiM9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KI1NwbGl0aW5nIHRyYWluaW5nIHNldCBpbnRvIHR3byBwYXJ0cyBiYXNlZCBvbiBvdXRjb21lOiA3MCUgYW5kIDMwJQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09CgppbmRleDwtY3JlYXRlRGF0YVBhcnRpdGlvbih0cmFuc2Zvcm1lZCR5LHA9MC43MCwgbGlzdD1GQUxTRSkKCnRyYWluU2V0PC10cmFuc2Zvcm1lZFtpbmRleCxdCgp0ZXN0U2V0PC10cmFuc2Zvcm1lZFstaW5kZXgsXQoKb3V0Y29tZU5hbWU8LSd5JwoKcHJlZGljdG9yczwtbmFtZXModHJhaW5TZXQpWyFuYW1lcyh0cmFpblNldCkgJWluJSBvdXRjb21lTmFtZV0KCnByZWRpY3RvcnMKCmBgYAoKCgoKCgpgYGB7cix3YXJuaW5nPUZBTFNFLG1lc3NhZ2U9RkFMU0V9CiM9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KI0ZlYXR1cmUgc2VsZWN0aW9uIHVzaW5nIHJmZSBpbiBjYXJldChyZWN1cnNpdmUgZmVhdHVyZSBleHRyYWN0aW9uKQojcHJlZGljdG9yczwtbmFtZXModHJhaW5TZXQpWyFuYW1lcyh0cmFpblNldCkgJWluJSBvdXRjb21lTmFtZV0KI0FsdGVybmF0aXZlbHkKI3ByZWRpY3RvcnM8LXNldGRpZmYobmFtZXModHJhaW5TZXQpLG91dGNvbWVOYW1lKQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09CgpsaWJyYXJ5KHJhbmRvbUZvcmVzdCkKCmNvbnRyb2wgPC0gcmZlQ29udHJvbChmdW5jdGlvbnMgPSByZkZ1bmNzLAogICAgICAgICAgICAgICAgICAgICAgbWV0aG9kID0gInJlcGVhdGVkY3YiLAogICAgICAgICAgICAgICAgICAgICAgcmVwZWF0cyA9IDMsCiAgICAgICAgICAgICAgICAgICAgICB2ZXJib3NlID0gRkFMU0UpCm91dGNvbWVOYW1lPC0neScKCnByZWRpY3RvcnM8LW5hbWVzKHRyYWluU2V0KVshbmFtZXModHJhaW5TZXQpICVpbiUgb3V0Y29tZU5hbWVdCgpmZWF0dXJlX3NlbGVjdCA8LSByZmUodHJhaW5TZXRbLHByZWRpY3RvcnNdLCB0cmFpblNldFssb3V0Y29tZU5hbWVdLAogICAgICAgICAgICAgICAgICAgICAgICAgcmZlQ29udHJvbCA9IGNvbnRyb2wpCgoKCgoKdGFibGUodHJhaW5TZXQkeSkKCgpmZWF0dXJlX3NlbGVjdAoKCm5hbWVzKGZlYXR1cmVfc2VsZWN0KQoKCgpwcmVkaWN0b3JzKGZlYXR1cmVfc2VsZWN0KQoKc3VtbWFyeShmZWF0dXJlX3NlbGVjdCkKCmZlYXR1cmVfc2VsZWN0WyAiYmVzdFN1YnNldCJdCgpgYGAKCgoKCgpgYGB7cn0KCiNzYXZlKGZlYXR1cmVfc2VsZWN0LGZpbGU9ImZlYXR1cmVfc2VsZWN0LlJEYXRhIikKIApsb2FkKCJmZWF0dXJlX3NlbGVjdC5SRGF0YSIpCgoKCmBgYAoKCgoKCmBgYHtyfQoKI1RoZSB0b3AgNSB2YXJpYWJsZXMgKG91dCBvZiA0Mik6CiNwcmludCgiVGhlIHRvcCA1IHZhcmlhYmxlcyAob3V0IG9mIDQyKVxuIikKY2F0KCJUaGUgdG9wIDUgdmFyaWFibGVzIChvdXQgb2YgNDIpXG4iKQpjYXQoImR1cmF0aW9uLCBwb3V0Y29tZXN1Y2Nlc3MsIG1vbnRobWFyLCBjb250YWN0dW5rbm93biwgaG91c2luZ3llc1xuIikgICAKCgpwcmVkaWN0b3JzKGZlYXR1cmVfc2VsZWN0KQoKYGBgCgoKCgoKCgoKCgoKCmBgYHtyfQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KIyBwbG90IHZhcmlhYmxlIHNlbGVjdGlvbgojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KCgp0cmVsbGlzLnBhci5zZXQoY2FyZXRUaGVtZSgpKQoKcGxvdChmZWF0dXJlX3NlbGVjdCwgdHlwZSA9IGMoICJvIiwiZyIpKQoKYGBgCgpBYm91dCA4KDE0IHZhcmlhYmxlcyApIGZlYXR1cmVzIHByb3ZpZGVzIHRoZSBvcHRpbWFsICBhY2N1cmFjeSBmb3IgdHJhaW5pbmcuCgpUaGUgdG9wIDUgdmFyaWFibGVzIHByb3ZpZGVzIGFuIGFjY3VyYWN5IG9mIGFib3V0ICA5MCUgZm9yIHRoZSBkYXRhLiBUaGUgcmVtYWluaW5nIDM2ICB2YXJpYWJsZXMgYWRkICBsZXNzIHRoYW4gMC4xIC4gVGhpcyBpcyB0aGUgYWR2YW50YWdlIG9mIGZlYXR1cmUgZW5naW5lZXJpbmcuIEl0IGhlbHBzIHRvIHJlZHVjZSBjb21wbGV4aXR5IGluIHRoZSBtb2RlbCwgcmVkdWNlIG92ZXJmaXR0aW5nIGFuZCBhbHNvIGNvbXB1dGF0aW9uYWx5IHRpbWUuCgoKCmBgYHtyLHdhcm5pbmc9RkFMU0UsbWVzc2FnZT1GQUxTRX0KIz09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09CiNUYWtpbmcgb25seSB0aGUgdG9wIDUgcHJlZGljdG9ycwojQWdlLCBFbXBsb3ltZW50LlByaW1hcmlseS5yZXRpcmVkLi5vciwgRWR1Y2F0aW9uLjEwdGguR3JhZGUsIAojRW1wbG95bWVudC5VbmFibGUudG8ud29yay4sIHJhY2VfYmxhY2suWWVzIGZyb20gZmVhdHVyZV9zZWxlY3QKIyBDcyBmdW5jdGlvbiBmcm9tIEhtaXNjIGNvbnZlcnRzIG5hbWVzIHRvIGNoYXJhY3RlciB2YXJpYWJsZXMKCiM9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQoKCgpsaWJyYXJ5KEhtaXNjKQoKcD1jKHBhc3RlMChwcmVkaWN0b3JzKGZlYXR1cmVfc2VsZWN0KSxzZXA9IiwiLGNvbGxhcHNlID0gIiIpKQoKI3RyYWluU2V0WyxdCgpwcmVkaWN0b3I9SG1pc2M6OkNzKGR1cmF0aW9uLHBvdXRjb21lLnN1Y2Nlc3MsbW9udGgubWFyLGNvbnRhY3QudW5rbm93bixhZ2UsaG91c2luZy55ZXMsZGF5LG1vbnRoLm9jdCxwZGF5cyxtb250aC5zZXAsbW9udGguanVsLG1vbnRoLmRlYyxtb250aC5tYXksbW9udGguYXVnLGNhbXBhaWduLG1vbnRoLmp1biApCgoKCgpwcmludCgiVGhlIHNldCBvZiBmZWF0dXJlcyBzZWxlY3RlZCBpczpcbiIpIAoKcAoKCgp0cmFpblNldFssb3V0Y29tZU5hbWVdJT4laGVhZCgpCgpjbGFzcyh0cmFpblNldFssb3V0Y29tZU5hbWVdKQoKVG9wZml2ZXByZWQgPSBIbWlzYzo6Q3MoZHVyYXRpb24sIHBvdXRjb21lc3VjY2VzcywgbW9udGhtYXIsIGNvbnRhY3R1bmtub3duLCBob3VzaW5neWVzKSAKCnRyYWluU2V0WyxUb3BmaXZlcHJlZF0lPiVoZWFkKCkKCgpgYGAKCgoqTG9naXN0aWMgUmVncmVzc2lvbiBNb2RlbCoKCmBgYHtyfQptb2RlbF9nbG08LXRyYWluKHRyYWluU2V0WyxUb3BmaXZlcHJlZF0sYXMuZmFjdG9yKHRyYWluU2V0JHkpLG1ldGhvZD0nZ2xtJyxmYW1pbHk9ImJpbm9taWFsIikKCgpzdW1tYXJ5KG1vZGVsX2dsbSkKYGBgCgoKCgoKVGhlIGFjY3VyYWN5IG9mIHRoZSBsb2dpc3RpYyByZWdyZXNzaW9uIG1vZGVsIGFib3V0IDkwLjQgJQpgYGB7cn0KCgoKCiNzYXZlKG1vZGVsX2dsbSxmaWxlPSJtb2RlbF9nbG0uUkRhdGEiKQogCmxvYWQoIm1vZGVsX2dsbS5SRGF0YSIpCiAKIAojIFByZWRpY3QgdXNpbmcgdGhlIHRlc3QgZGF0YQpwcmVkPC1wcmVkaWN0KG1vZGVsX2dsbSx0ZXN0U2V0KQoKbXlfZGF0YT1kYXRhLmZyYW1lKGNiaW5kKHByZWRpY3RlZD1wcmVkLG9ic2VydmVkPXRlc3RTZXQkeSkpCgpnZ3Bsb3QobXlfZGF0YSxhZXMocHJlZGljdGVkLG9ic2VydmVkKSkrZ2VvbV9wb2ludCgpK2dlb21fc21vb3RoKG1ldGhvZD1sbSkrZ2d0aXRsZSgnbG9naXN0aWMgbW9kZWwnKQoKIyBQcmludCwgcGxvdCB2YXJpYWJsZSBpbXBvcnRhbmNlCnByaW50KHZhckltcChtb2RlbF9nbG0sIHNjYWxlID0gRkFMU0UpKQoKcGxvdCh2YXJJbXAobW9kZWxfZ2xtLCBzY2FsZSA9IEZBTFNFKSwgbWFpbj0iVmFyaWFibGUgSW1wb3J0YW5jZSB1c2luZyBsb2dpc3RpYy9nbG0iKQoKCmNvbmZ1c2lvbk1hdHJpeCh0ZXN0U2V0JHkscHJlZCkKCmBgYAoKCgoKCmBgYHtyLHdhcm5pbmc9RkFMU0UsbWVzc2FnZT1GQUxTRX0KIz09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQojUk9DUiBDdXJ2ZQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09CgoKCgpsaWJyYXJ5KHBST0MpCiNpbnN0YWxsLnBhY2thZ2VzKCJwUk9DIikKIyBDb21wdXRlIEFVQyBmb3IgcHJlZGljdGluZyBDbGFzcyB3aXRoIHRoZSB2YXJpYWJsZSBDcmVkaXRIaXN0b3J5LkNyaXRpY2FsCmYxID0gcm9jKERhdGEkeSB+IERhdGEkZHVyYXRpb24sIGRhdGE9dHJhaW5TZXQpIApwbG90KGYxLCBjb2w9InJlZCIpCgoKCnA9Z2dwbG90KHRyYWluU2V0LCBhZXMoZCA9IHksIG0gPSBkdXJhdGlvbikpICsgZ2VvbV9yb2MoKSsgc3R5bGVfcm9jKCkKCgpwbG90X2ludGVyYWN0aXZlX3JvYygocCkpCgojRHJhdyB0aGUgUk9DIGN1cnZlIApnbG0ucHJvYnMgPC0gcHJlZGljdChtb2RlbF9nbG0sdGVzdFNldCx0eXBlPSJwcm9iIikKCmhlYWQoZ2xtLnByb2JzKQogCiMgCiMgZ2xtLlJPQyA8LSByb2MocHJlZGljdG9yPWdsbS5wcm9icyRQUywKIyAgICAgICAgICAgICAgICByZXNwb25zZT10ZXN0U2V0JHksCiMgICAgICAgICAgICAgICAgbGV2ZWxzPXJldihsZXZlbHModGVzdFNldCR5KSkpCiMgZ2JtLlJPQyRhdWMKIyAKIyAjQXJlYSB1bmRlciB0aGUgY3VydmU6IDAuODczMQojIHBsb3QoZ2JtLlJPQyxtYWluPSJHTE0gUk9DIikKCmBgYAoKCgoKYGBge3Isd2FybmluZz1GQUxTRSxtZXNzYWdlPUZBTFNFfQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KIyBtdWx0aXBsZSBhbGdvcml0aG1zCiM9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQoKZml0Q29udHJvbCA8LSB0cmFpbkNvbnRyb2woCiAgbWV0aG9kID0gInJlcGVhdGVkY3YiLAogIG51bWJlciA9IDUsY2xhc3NQcm9icyA9IFRSVUUsCiAgcmVwZWF0cyA9IDUpCgptZXQ9YygiTG9naXRCb29zdCIsICd4Z2JUcmVlJywgJ3JmJywgJ3N2bVJhZGlhbCcpCgoKCnJlPWxpc3QoKQoKZm9yIChpIGluIHNlcV9hbG9uZyhtZXQpKSB7CiAgCiByZVtbaV1dPC10cmFpbih0cmFpblNldFssVG9wZml2ZXByZWRdLHRyYWluU2V0WyxvdXRjb21lTmFtZV0sbWV0aG9kPW1ldFtpXSwKICAgICAgICAgICAgICAgIAogICAgICAgICAgICAgICAgICAgIHByZVByb2Nlc3MgPSBjKCJjZW50ZXIiLCJzY2FsZSIpKSAgCn0KCgpgYGAKClNhdmUgbW9kZWxzIHRvIGF2b2lkIHJ1bm5pbmcgdGhlIG1vZGVsIGV2ZXJ5IHRpbWUuCmBgYHtyfQoKCnNhdmUocmUsZmlsZT0ibXVsdGlwbGVhbGdvcml0aG1zLlJEYXRhIikKIApsb2FkKCJtdWx0aXBsZWFsZ29yaXRobXMuUkRhdGEiKQogCiAKI3JlW1sxXV0KI3JlW1syXV0KI3JlW1szXV0KI3JlW1s0XV0KCiNyZSU+JWFzLmxpc3QuZGF0YS5mcmFtZSgpJT4la2FibGUoKQoKCgpgYGAKCgoKKkJvb3N0ZWQgTG9naXN0aWMgUmVncmVzc2lvbiogCgpgYGB7cn0KCnJlW1sxXV0lPiVoZWFkKCkKCnJlW1sxXV1bWzRdXSU+JWthYmxlKCkKCgoKcGxvdChyZVtbMV1dKQoKCnZhckltcChvYmplY3Q9cmVbWzFdXSkKCgojIFByZWRpY3QgdXNpbmcgdGhlIHRlc3QgZGF0YQpwcmVkPC1wcmVkaWN0KHJlW1sxXV0sdGVzdFNldCkKCm15X2RhdGE9ZGF0YS5mcmFtZShjYmluZChwcmVkaWN0ZWQ9cHJlZCxvYnNlcnZlZD10ZXN0U2V0JHkpKQoKZ2dwbG90KG15X2RhdGEsYWVzKHByZWRpY3RlZCxvYnNlcnZlZCkpK2dlb21fcG9pbnQoKStnZW9tX3Ntb290aChtZXRob2Q9bG0pK2dndGl0bGUoJ0Jvb3N0ZWQgTG9naXN0aWMgUmVncmVzc2lvbicpCgojIFByaW50LCBwbG90IHZhcmlhYmxlIGltcG9ydGFuY2UKcHJpbnQodmFySW1wKHJlW1sxXV0sIHNjYWxlID0gRkFMU0UpKQoKcGxvdCh2YXJJbXAocmVbWzFdXSwgc2NhbGUgPSBGQUxTRSksIG1haW49IkJvb3N0ZWQgTG9naXN0aWMgUmVncmVzc2lvbiIpCgoKY29uZnVzaW9uTWF0cml4KHRlc3RTZXQkeSxwcmVkKQoKYGBgCgoKKkVYdHJlbWUgR3JhZGllbnQgQm9vc3RpbmcqIApgYGB7cn0KCiNyZVtbMl1dCgpyZVtbMl1dW1s0XV0lPiVoZWFkKCklPiVrYWJsZSgpCgoKcGxvdChyZVtbMl1dKQoKCnZhckltcChvYmplY3Q9cmVbWzJdXSkKCgojIFByZWRpY3QgdXNpbmcgdGhlIHRlc3QgZGF0YQpwcmVkPC1wcmVkaWN0KHJlW1syXV0sdGVzdFNldCkKCm15X2RhdGE9ZGF0YS5mcmFtZShjYmluZChwcmVkaWN0ZWQ9cHJlZCxvYnNlcnZlZD10ZXN0U2V0JHkpKQoKZ2dwbG90KG15X2RhdGEsYWVzKHByZWRpY3RlZCxvYnNlcnZlZCkpK2dlb21fcG9pbnQoKStnZW9tX3Ntb290aChtZXRob2Q9bG0pK2dndGl0bGUoJ0VYdHJlbWUgR3JhZGllbnQgQm9vc3RpbmcnKQoKIyBQcmludCwgcGxvdCB2YXJpYWJsZSBpbXBvcnRhbmNlCnByaW50KHZhckltcChyZVtbMl1dLCBzY2FsZSA9IEZBTFNFKSkKCnBsb3QodmFySW1wKHJlW1syXV0sIHNjYWxlID0gRkFMU0UpLCBtYWluPSJFWHRyZW1lIEdyYWRpZW50IEJvb3N0aW5nIikKCgpjb25mdXNpb25NYXRyaXgodGVzdFNldCR5LHByZWQpCmBgYAoKCipSYW5kb20gRm9yZXN0KgpgYGB7cn0KcmVbWzNdXQoKcmVbWzNdXVtbNF1dJT4la2FibGUoKQoKcGxvdChyZVtbM11dKQoKCnZhckltcChvYmplY3Q9cmVbWzNdXSkKCgojIFByZWRpY3QgdXNpbmcgdGhlIHRlc3QgZGF0YQpwcmVkPC1wcmVkaWN0KHJlW1szXV0sdGVzdFNldCkKCm15X2RhdGE9ZGF0YS5mcmFtZShjYmluZChwcmVkaWN0ZWQ9cHJlZCxvYnNlcnZlZD10ZXN0U2V0JHkpKQoKZ2dwbG90KG15X2RhdGEsYWVzKHByZWRpY3RlZCxvYnNlcnZlZCkpK2dlb21fcG9pbnQoKStnZW9tX3Ntb290aChtZXRob2Q9bG0pK2dndGl0bGUoJ1JhbmRvbSBGb3Jlc3QnKQoKIyBQcmludCwgcGxvdCB2YXJpYWJsZSBpbXBvcnRhbmNlCnByaW50KHZhckltcChyZVtbM11dLCBzY2FsZSA9IEZBTFNFKSkKCnBsb3QodmFySW1wKHJlW1szXV0sIHNjYWxlID0gRkFMU0UpLCBtYWluPSJSYW5kb20gRm9yZXN0IikKCgpjb25mdXNpb25NYXRyaXgodGVzdFNldCR5LHByZWQpCmBgYAoKCgoqU3VwcG9ydCBWZWN0b3IgTWFjaGluZXMgd2l0aCBSYWRpYWwgQmFzaXMgRnVuY3Rpb24gS2VybmVsKgoKYGBge3J9CmZpdENvbnRyb2wgPC0gdHJhaW5Db250cm9sKG1ldGhvZCA9ICJyZXBlYXRlZGN2IixudW1iZXIgPSA1LHJlcGVhdHMgPSA1LAogICAgICAgICAgICAgICAgICAgICAgICAgICBhbGxvd1BhcmFsbGVsID0gVFJVRSApCgoKbW9kZWxfc3ZtPC10cmFpbih0cmFpblNldFssVG9wZml2ZXByZWRdLHRyYWluU2V0WyxvdXRjb21lTmFtZV0sbWV0aG9kPSdzdm1SYWRpYWwnLAogICAgICAgICAgICAgICAgICB0ckNvbnRyb2w9Zml0Q29udHJvbCkKCiNwcmVkaWN0aW9uczwtcHJlZGljdC50cmFpbihvYmplY3Q9cmVbWzRdXSx0ZXN0U2V0WyxwcmVkaWN0b3JzXSx0eXBlPSJyYXciKQoKI3ByZWRpY3Rpb25zCgoKCmBgYAoKCgpgYGB7cn0KCm1vZGVsX3N2bQoKCnNhdmUobW9kZWxfc3ZtLGZpbGU9Im1vZGVsX3N2bS5SRGF0YSIpCiAKbG9hZCgibW9kZWxfc3ZtLlJEYXRhIikKCgoKcGxvdChtb2RlbF9zdm0pCgoKdmFySW1wKG9iamVjdD1tb2RlbF9zdm0pCgoKIyBQcmVkaWN0IHVzaW5nIHRoZSB0ZXN0IGRhdGEKI3ByZWQ8LXByZWRpY3QobW9kZWxfc3ZtLHRlc3RTZXQpCgpwcmVkaWN0aW9uczwtcHJlZGljdC50cmFpbihvYmplY3Q9bW9kZWxfc3ZtLHRlc3RTZXRbLFRvcGZpdmVwcmVkXSx0eXBlPSJyYXciKQoKbXlfZGF0YT1kYXRhLmZyYW1lKGNiaW5kKHByZWRpY3RlZD1wcmVkaWN0aW9ucyxvYnNlcnZlZD10ZXN0U2V0JHkpKQoKZ2dwbG90KG15X2RhdGEsYWVzKHByZWRpY3RlZCxvYnNlcnZlZCkpK2dlb21fcG9pbnQoKStnZW9tX3Ntb290aChtZXRob2Q9bG0pK2dndGl0bGUoJ1N1cHBvcnQgVmVjdG9yIE1hY2hpbmVzIHdpdGggUmFkaWFsIEJhc2lzIEZ1bmN0aW9uIEtlcm5lbCcpCgojIFByaW50LCBwbG90IHZhcmlhYmxlIGltcG9ydGFuY2UKcHJpbnQodmFySW1wKG1vZGVsX3N2bSwgc2NhbGUgPSBGQUxTRSkpCgpwbG90KHZhckltcChtb2RlbF9zdm0sIHNjYWxlID0gRkFMU0UpLCBtYWluPSJTdXBwb3J0IFZlY3RvciBNYWNoaW5lcyB3aXRoIFJhZGlhbCBCYXNpcyBGdW5jdGlvbiBLZXJuZWwiKQoKCmNvbmZ1c2lvbk1hdHJpeCh0ZXN0U2V0JHkscHJlZGljdGlvbnMpCgpwcmVkaWN0aW9ucyU+JWhlYWQoKQoKYGBgCgoKCgpGaW5kIG1vZGVscyB0aGF0IGFyZSBzdXBwb3J0ZWQgYnkgdGhlIGNhcmV0IHBhY2thZ2UuIFRoZXJlIGFyZSBvdmVyIDIwMCBtb2RlbHMgdGhhdCBjYW4gYmUgaW1wbGVtZW50ZWQgaW4gdGhlIGNhcmV0IHBhY2thZ2UgYXQgbGFzdCBjb3VudC4KYGBge3J9Cm5hbWVzKGdldE1vZGVsSW5mbygpKSU+JWhlYWQoKQpgYGAKCgoKCmBgYHtyfQojPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KI2dyYWRpZW50IGJvb3N0ZWQgdHJlZXMKIyBwYXJhbWV0ZXIgdHVuaW5nCiM9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQoKCmZpdENvbnRyb2wgPC0gdHJhaW5Db250cm9sKG1ldGhvZCA9ICJyZXBlYXRlZGN2IixudW1iZXIgPSA1LHJlcGVhdHMgPSA1LAogICAgICAgICAgICAgICAgICAgICAgICAgICBhbGxvd1BhcmFsbGVsID0gVFJVRSApCgoKCgojZml0Q29udHJvbCA8LSB0cmFpbkNvbnRyb2woCiMgIG1ldGhvZCA9ICJyZXBlYXRlZGN2IiwKIyAgbnVtYmVyID0gNSxjbGFzc1Byb2JzID0gVFJVRSwKIyAgcmVwZWF0cyA9IDUsYWxsb3dQYXJhbGxlbCA9IFRSVUUpCgojQ3JlYXRpbmcgZ3JpZApncmlkIDwtIGV4cGFuZC5ncmlkKG4udHJlZXM9YygxMCwyMCw1MCwxMDAsNTAwLDEwMDApLHNocmlua2FnZT1jKDAuMDEsMC4wNSwwLjEsMC41KSxuLm1pbm9ic2lubm9kZSA9IGMoMyw1LDEwKSxpbnRlcmFjdGlvbi5kZXB0aD1jKDEsNSwxMCkpCiMgdHJhaW5pbmcgdGhlIG1vZGVsCm1vZGVsX2dibTwtdHJhaW4odHJhaW5TZXRbLFRvcGZpdmVwcmVkXSx0cmFpblNldFssb3V0Y29tZU5hbWVdLG1ldGhvZD0nZ2JtJyx0ckNvbnRyb2w9Zml0Q29udHJvbCx0dW5lR3JpZD1ncmlkKQojIHN1bW1hcml6aW5nIHRoZSBtb2RlbApwcmludChtb2RlbF9nYm0pCgoKYGBgCgoKCgpgYGB7cn0KbW9kZWxMb29rdXAobW9kZWw9J2dibScpCmBgYAoKCgpgYGB7cn0KCiNtb2RlbF9nYm0lPiVhcy5saXN0LmRhdGEuZnJhbWUoKSU+JWthYmxlKCkKc2F2ZShtb2RlbF9nYm0sZmlsZT0ibW9kZWxfZ2JtLlJEYXRhIikKIApsb2FkKCJtb2RlbF9nYm0uUkRhdGEiKQogCiAKbW9kZWxfZ2JtJGJlc3RUdW5lJT4la2FibGUoKQogCm1vZGVsX2dibSRyZXN1bHRzJT4laGVhZCgpJT4la2FibGUoKQoKI3ZhcmlvdXMgb2YgZmluZGluZyB0aGUgcm93IHdpdGggbWF4aW11bSBhY2N1cmFjeQoKbW9kZWxfZ2JtJHJlc3VsdHNbd2hpY2gubWF4KG1vZGVsX2dibSRyZXN1bHRzJEFjY3VyYWN5KSxdICAgCiAgICAKICAgIAptb2RlbF9nYm0kcmVzdWx0cyU+JWZpbHRlcigpJT4lZHBseXI6OnN1bW1hcmlzZShtYXgxPW1heChBY2N1cmFjeSkpCiAgICAKICAgIAptb2RlbF9nYm0kcmVzdWx0cyAlPiUgZHBseXI6OnNsaWNlKHdoaWNoLm1pbihBY2N1cmFjeSApKQogICAgCm1vZGVsX2dibSRyZXN1bHRzJT4lZHBseXI6OnNsaWNlKHdoaWNoLm1heChBY2N1cmFjeSApKQogICAgCgptb2RlbF9nYm0kcmVzdWx0c1sgd2hpY2gobW9kZWxfZ2JtJHJlc3VsdHMkQWNjdXJhY3kgPT1tYXgobW9kZWxfZ2JtJHJlc3VsdHMkQWNjdXJhY3kpKSAgLF0KCnBsb3QobW9kZWxfZ2JtKQojIAojIHByZWQ8LXByZWRpY3QobW9kZWxfZ2JtLGlyaXNfdGVzdCkKIyAKIyBDb25mX21hdHJpeDwtY29uZnVzaW9uTWF0cml4KHByZWQsaXJpc1sxOjUsNV0pCiMgCiMga2FibGUoQ29uZl9tYXRyaXgkdGFibGUpCmBgYAoKCgoKYGBge3J9CiN1c2luZyB0dW5lIGxlbmd0aAoKZml0Q29udHJvbCA8LSB0cmFpbkNvbnRyb2wobWV0aG9kID0gInJlcGVhdGVkY3YiLG51bWJlciA9IDUscmVwZWF0cyA9IDUsCiAgICAgICAgICAgICAgICAgICAgICAgICAgIGFsbG93UGFyYWxsZWwgPSBUUlVFICkKCgptb2RlbF9nYm0yPC10cmFpbih0cmFpblNldFssVG9wZml2ZXByZWRdLHRyYWluU2V0WyxvdXRjb21lTmFtZV0sbWV0aG9kPSdnYm0nLAogICAgICAgICAgICAgICAgICB0ckNvbnRyb2w9Zml0Q29udHJvbCkKCgpgYGAKCldlIGNhbiB1c2UgdHVuZUxlbmd0aCAgaW5zdGVhZCBvZiBzcGVjaWZ5aW5nIHRoZSB2YWx1ZSBvZiBlYWNoIHBhcmFtZXRlci4KVGhpcyBhbGxvd3MgYW55IG51bWJlciBvZiBwb3NzaWJsZSB2YWx1ZXMgZm9yIGVhY2ggdHVuaW5nIHBhcmFtZXRlciB0aHJvdWdoIHR1bmVMZW5ndGguCgpgYGB7cn0KIyBtb2RlbF9nYm0zPC10cmFpbih0cmFpblNldFssVG9wZml2ZXByZWRdLHRyYWluU2V0WyxvdXRjb21lTmFtZV0sbWV0aG9kPSdnYm0nLAojICAgICAgICAgICAgICAgICAgIHRyQ29udHJvbD1maXRDb250cm9sLGludGVyYWN0aW9uLmRlcHRoPTEwLG4udHJlZXM9MTAwLG4ubWlub2JzaW5ub2RlPTEwKQojIHByaW50KG1vZGVsX2dibSkKYGBgCgoKCmBgYHtyfQoKc2F2ZShtb2RlbF9nYm0yLGZpbGU9Im1vZGVsX2dibTIuUkRhdGEiKQogCiBsb2FkKCJtb2RlbF9nYm0yLlJEYXRhIikKCnByaW50KG1vZGVsX2dibTIpCgpwbG90KG1vZGVsX2dibTIpCgp2YXJJbXAob2JqZWN0PW1vZGVsX2dibTIpCgoKIyBQcmVkaWN0IHVzaW5nIHRoZSB0ZXN0IGRhdGEKcHJlZDwtcHJlZGljdChtb2RlbF9nYm0yLHRlc3RTZXQpCgpteV9kYXRhPWRhdGEuZnJhbWUoY2JpbmQocHJlZGljdGVkPXByZWQsb2JzZXJ2ZWQ9dGVzdFNldCR5KSkKCmdncGxvdChteV9kYXRhLGFlcyhwcmVkaWN0ZWQsb2JzZXJ2ZWQpKStnZW9tX3BvaW50KCkrZ2VvbV9zbW9vdGgobWV0aG9kPWxtKStnZ3RpdGxlKCdTdG9jaGFzdGljIEdyYWRpZW50IEJvb3N0aW5nIE1hY2hpbmUnKQoKIyBQcmludCwgcGxvdCB2YXJpYWJsZSBpbXBvcnRhbmNlCnByaW50KHZhckltcChtb2RlbF9nYm0yLCBzY2FsZSA9IEZBTFNFKSkKCnBsb3QodmFySW1wKG1vZGVsX2dibTIsIHNjYWxlID0gRkFMU0UpLCBtYWluPSJTdG9jaGFzdGljIEdyYWRpZW50IEJvb3N0aW5nIikKCgpjb25mdXNpb25NYXRyaXgodGVzdFNldCR5LHByZWQpCgpgYGAKCgoKVGhlIG1heGltdW0gYWNjdXJhY3kgIG9mIDAuOTA1OSBvY2N1cnMgYXQgdGhlc2UgcGFyYW1ldGVyIGNvbWJpbmF0aW9ucyBzaHJpbmthZ2U9MC4wMSxpbnRlcmFjdGlvbi5kZXB0aD0xMCxuLm1pbm9ic2lubm9kZT0xMCBhbmQgbi50cmVlcz0xMDAwLiBUaGUgbWV3IG1vZGVsIHdpbGwgYmUgZml0dGVkIHdpdGggdGhlc2UgcGFyYW1ldGVyIHZhbHVlcy4KCgoKCipOZXVyYWwgbmV0d29ya3MqCgpgYGB7cn0KCm1vZGVsX25uZXQ8LXRyYWluKHRyYWluU2V0WyxUb3BmaXZlcHJlZF0sdHJhaW5TZXRbLG91dGNvbWVOYW1lXSxtZXRob2Q9Im5uZXQiLHRyQ29udHJvbD1maXRDb250cm9sKQoKYGBgCgoKCmBgYHtyfQpzYXZlKG1vZGVsX25uZXQsZmlsZT0ibW9kZWxfbm5ldC5SRGF0YSIpCiAKbG9hZCgibW9kZWxfbm5ldC5SRGF0YSIpCgpwcmludChtb2RlbF9ubmV0KQoKcGxvdChtb2RlbF9ubmV0KQoKdmFySW1wKG9iamVjdD1tb2RlbF9ubmV0KQoKCiMgUHJlZGljdCB1c2luZyB0aGUgdGVzdCBkYXRhCnByZWQ8LXByZWRpY3QobW9kZWxfbm5ldCx0ZXN0U2V0KQoKbXlfZGF0YT1kYXRhLmZyYW1lKGNiaW5kKHByZWRpY3RlZD1wcmVkLG9ic2VydmVkPXRlc3RTZXQkeSkpCgpnZ3Bsb3QobXlfZGF0YSxhZXMocHJlZGljdGVkLG9ic2VydmVkKSkrZ2VvbV9wb2ludCgpK2dlb21fc21vb3RoKG1ldGhvZD1sbSkrZ2d0aXRsZSgnU3RvY2hhc3RpYyBHcmFkaWVudCBCb29zdGluZyBNYWNoaW5lJykKCiMgUHJpbnQsIHBsb3QgdmFyaWFibGUgaW1wb3J0YW5jZQpwcmludCh2YXJJbXAobW9kZWxfbm5ldCwgc2NhbGUgPSBGQUxTRSkpCgpwbG90KHZhckltcChtb2RlbF9ubmV0LCBzY2FsZSA9IEZBTFNFKSwgbWFpbj0iU3RvY2hhc3RpYyBHcmFkaWVudCBCb29zdGluZyIpCgoKY29uZnVzaW9uTWF0cml4KHRlc3RTZXQkeSxwcmVkKQoKYGBgCgoKCgoKYGBge3J9CnN0b3BJbXBsaWNpdENsdXN0ZXIoKQpgYGAKCg==